INTRODUCTION


The SHARP APL Statistical Analysis Library is a collection of more than twenty workspaces. Each 
of these workspaces contains a number of functions which can be used in such areas of statistics as 
analysis and design of experiments, data reduction, time series analysis, and so on. 


The main function and all related auxiliary functions and variables which are necessary for a given 
application are grouped together in a single workspace. This facilitates the assembly of the elements 
needed to solve a particular problem. 


The Statistical Analysis Library exhibits several characteristics of convenience to users: program 
modularity, the separation of input/output routines from calculation routines (which have no input/ 
output), the conditioning of calculation routines through state setting functions, and printing of input 
and output in a format suitable for reports. The library is designed in such a way as to be applicable 
to a variety of user environments. For example, depending on the requirements of the user, the 
functions contained in a particular workspace can be used in any of the following ways: 


- in calculator mode, where the emphasis is on exploratory and one-time use. 
- in defined function mode, where statistical routines must integrate easily into a user system. 


- in conversational mode, where the inexperienced user is led conversationally through the 
process required for an application. 


- in non-conversational mode, where the experienced user can obtain quick answers. 


APL is a language well suited to the development of statistical functions. Although it is not necessary 
to be an experienced APL programmer in order to use many of the functions contained in the 
Statistical Analysis Library, a cursory knowledge of the language will prove helpful in most cases. 
Those interested should refer to the I.P. Sharp manual “An Introduction to SHARP APL” and/or 
the book entitled “APL: An Interactive Approach” by Gilman and Rose. 


Many of the functions in this library have either been contributed from other sources or drawn from 
the public domain. Particular acknowledgement is paid to Mr. Knight of University of New Bruns- 
wick, Dr. Snyder of Bar-Ilan University, Dr. McLaughlin of Syracuse University, Dr. Smillie of 
University of Alberta, Mr. Wheeler of DuPont de Nemours and Mrs. Gibson, Mr. Maxwell, and 
Mr. Swaminathan of University of Guelph. 


HOW TO USE THIS MANUAL 


The Statistical Analysis Library is currently comprised of the six libraries listed below. As indicated, 
each library contains a number of workspaces. Each workspace, in turn, is comprised of a number 
of related functions. 


General Statistical Functions 


31 CROSSTAB This package performs crosstabulation such as could be required in analysis 
of response to questionnaires. It can handle up to 15000 questionnaires. 
Crosstabulation on a large number of dimensions is possible. Results are available 
in raw form for possible further processing or neatly formatted for printing. 
After the setup phase, generation of results may be in either interactive or 
non-interactive mode. A convenient English-like query language is provided for 
specifying the desired crosstabulations. For additional information concerning 
this package, see the manual entitled “CROSSTAB - A Crosstabulation Pack- 
age in SHARP APL”. 


31 CTAB Functions in this workspace perform 2- or 3-way crosstabulations, as well as 
producing various test statistics concerning contingency tables. Such statistics 
include chi-squares for contingency tables of various sizes, Cramer's statistic, 
the Fisher exact probability and the Goodman-Kruskal gamma coefficient. 


31 DSTAT This workspace provides the user with functions to compute such descriptive 
statistics as sample size, mean, variance, standard deviation, standard error of 
the mean, mean deviation, median, maximum, minimum, range, mode, skew- 
ness and kurtosis. 


31 ENTRY Functions included in this workspace provide the capability to enter numeric 
data easily, correct it, format it into tabular form, and print it with optional 
titles. A caption and multi-lined column headings and row labels may be 
included. Conversational and non-conversational tabulation routines are avail- 
able. 


31 FREQUENCY This workspace contains functions for frequency classification of data, calculation 
of n-tiles, production of histograms and tables, and computation of parameters 
of best fit to normal, binomial or Poisson distributions. 


31 MISC The workspace 31 MISC provides the user with a number of miscellaneous 
tools for use in statistical applications: weighted moving averages, rounding, 
and permutation and combination generators. 


31 NONPARAMETRIC The functions in this workspace calculate a wide variety of non-parametric 
statistics. Included are binomial tests, analyses of variance, rank correlation 
coefficients, coefficients of concordance, tests for trend, change, runs and signs, 
and median tests. 


31 PARAMETRIC This workspace contains functions to perform Bartlett’s and Duncan's tests on 
user data. The first tests homogeneity of variances for normally distributed 
populations; the second is a multiple range test on the differences of a set of 
treatment means with equal or unequal sample sizes. Also available is a 
function to perform linear comparisons among population means using 
Scheffe’s method for multiple comparisons. 


31 TTEST Functions in this workspace perform 1- and 2-sample t-tests, as well as computing 
the t-statistic for a single population and testing the population mean 
when the variance is unknown. 


Model Parameter Estimation 


32 CORRELATION 32 CORRELATION contains functions to produce correlation matrices and simple 
and partial correlation coefficients. 


32 FILEREG This workspace performs large-sample simple or multiple regression on data 
stored in SHARP APL files. 


32 LAGS This workspace provides a means of performing both polynomial-distributed 
(Almon) and Shiller lag analyses, and is particularly useful in time series 
analysis. 


32 NONLINEAR The functions in this workspace compute parameters of multivariate non-linear 
models using Marquardt's algorithm, which is a combination of the method 
of Gauss linearization and the method of steepest descent. Extensive statistical 
results may be obtained. As a special case, this workspace contains a function 
to evaluate parameters of the Gompertz equation. 


32 NPE The functions in this workspace perform nonlinear parameter estimation for 
a general class of models. Models may be sets of simultaneous equations, i.e., 
they may contain more than a single equation, and a particular parameter may 
appear in more than one equation. Equations within models may be multivariate 
and possibly nonlinear in their parameters and/or variables. 


Gauss's method is used to calculate least squares or FIML (Full Information 
Maximum Likelihood) estimates of all parameters appearing in the model. 
Models may be specified in reduced form or in structural form, and inequality 
constraints may be imposed upon the estimation procedure. Selection of all 
options is controlled by means of state setting functions. 


For a detailed description of the functions and examples of their use, see the 
manual entitled “Nonlinear Parameter Estimation”, available from your I.P. 
Sharp representative. 


32 PROBIT This workspace performs probit analysis on quantal data in biological assay. 


32 REGRESSION The workspace 32 REGRESSION contains functions to perform the estimation 
of model parameters by means of ordinary or generalized least squares regression 
techniques. In addition to the fundamental least squares routines, extensions 
include instrumental variables substitution, two- and three-stage least 
squares analyses of sets of simultaneous linear equations, and autocorrelation 
adjustments according to the methods of Hildreth-Lu and Cochrane-Orcutt. 
Optional output features for all programs are controlled by means of state 
setting functions. 


32 STEPWISE This workspace contains conversational programs which perform stepwise linear 
regression. Entry of variables into or their removal from the model may 
be under either computer or user control. Data may be in the workspace or 
on file. 


Probability Functions and Distributions 


33 PROBDIST Workspace 33 PROBDIST contains functions for computing values from many 
common cumulative distribution functions and their inverses, as well as values 
from some probability density functions. Those whose corresponding density 
functions are continuous include the chi-square, normal, F- and t-distributions. 
Those with discrete density functions are the binomial, Poisson and hypergeometric 
distributions, distributions of runs and matches, and the distribution 
of the signed-rank test. 


33 RANDOM Functions in this workspace generate random samples from populations 
governed by the following distributions: Cauchy, exponential, type I extreme 
value, Laplace, logistic, normal, uniform, beta, chi-square (and non-central 
chi-square), F (and non-central F), gamma, geometric, hypergeometric, lognormal, 
negative binomial, t (and non-central t) and Poisson. 


Analysis of Variance 


34 ANOVA Workspace 34 ANOVA provides the user with a number of functions for performing 
analysis of variance. Several of the functions have conversational counterparts 
to lead the user step by step through an analysis, although non-interactive 
state setting functions are also provided. 


Included in this workspace are functions for the following types of experimental 
design: crossed and/or nested with any number (≥2) of levels of any number 
(≥2) of factors; one-way with unequal subclass numbers; 3-factor experiments 
with Latin square design; N-factor experiments with more than 1 level per 
factor, blocked or completely random design, with or without missing values 
or replications, and with a standard or nonstandard error term; and split plot or 
split-split plot designs with a completely random or a randomized complete 
block design. In addition, there is a function to produce means tables and 
profiles of factor interactions for N-factor experiments. 


34 COVARIANCE This workspace performs analysis of covariance for a single variable of classification 
with several populations. Various statistics are produced and a number 
of hypotheses about the data are tested. 


34 ORTHOGONAL The functions in this workspace are intended for the analysis of orthogonal 
factorial experiments with up to six factors, one of which may represent 
replications, and an arbitrary number of levels of each factor. The analysis of 
variance table gives the degrees of freedom, sum of squares and mean square 
for each main effect and interaction term. Specified terms may be pooled to 
accommodate designs which are not complete factorials, such as split plot 
designs. Orthogonal single degree of freedom contrasts may be calculated for 
any main effect or interaction term. Finally, a method is provided for obtaining 
the analysis of variance table for any crossed, nested, or partially nested design 
from an algebraic specification of the appropriate linear model. 


For a complete account of the functions, together with examples of their use, 
see the publication entitled “Some APL Algorithms For Orthogonal Factorial 
Experiments” - K.W. Smillie, Publication No. 18, Department of Computing 
Science, University of Alberta, 1969. 


Multivariate Analysis 


35 DISCRIMINANT The workspace 35 DISCRIMINANT contains functions which perform a discriminant 
analysis. The functions construct a framework by which multivariate 
observations can be classified into one of a set of pre-specified populations. The 
variables to be used in the analysis can be specified directly by the user or 
selected in a stepwise manner. For both of these variable selection methods, 
two approaches may be taken to define the discriminating framework: 


- Minimizing the mean probability of misclassification, or 
- A "canonical variables" approach. 


An additional feature of the workspace allows the classification of any number 
of multivariate observations (of unknown origin) into one of the populations, 
once the discriminating framework has been established. 


35 PRINCIPAL This workspace contains functions to perform two types of multivariate statistical 
analysis: a principal component analysis and a canonical correlation analysis. 
These two techniques are both used to obtain a more parsimonious description 
of the variability which exists in multivariate data sets. The principal 
component analysis finds linear combinations of one set of variables which 
account for as much variance as possible within that variable set. The canonical 
correlation analysis finds linear combinations of each of two sets of variables 
with the aim of accounting for a maximum amount of the relationship between 
those two variable sets. 


Both of these analyses can be based upon a covariance matrix or a correlation 
matrix, either of which is estimated from the input data. The output in both 
analyses lists the coefficients of the corresponding linear combinations, starting 
with those combinations which provide the largest explanation of the variability 
inherent in the original data. This workspace also allows for the evaluation 
of the linear combinations at each of the original data observations and for the 
estimation of the correlations between the original variables and the resulting 
linear combinations of these variables. 


Time Series Analysis 


39 BOXJENKINS The workspace 39 BOXJENKINS performs the analysis and forecasting of time 
series according to the Box-Jenkins technique. This technique is a sound 
analytical means of achieving forecasts in various areas of planning, as well 
as in purchasing, marketing and on-line process control. Common examples 
of its use include applications in sales forecasting and production planning, as 
well as in the design of optimal control schemes for which the system must 
first be analysed and modelled. Using a logical succession of three stages, a 
comprehensive analysis is performed on a time series which has been sampled 
at discrete equally spaced time intervals. The fundamental assumption underly- 
ing the technique is that future values for a given time series can be determined 
using the mathematical formulation (stochastic model) which has been shown 
to describe the behaviour of the series in the past. Given this assumption, the 
primary objective in any time series forecasting analysis is to obtain a stochastic 
model which can be demonstrated to have generated the existing observations. 
As mentioned above, the Box-Jenkins procedure can be categorized into a 
logical succession of three distinct stages: 


- Model identification (and preliminary estimation) 
- Model estimation 
- Forecasting 


The first stage assists in the identification and preliminary estimation of an 
entire class of models, one of which is assumed to have generated the series. 
This model is estimated quantitatively in the second stage. Forecasts are finally 
generated, based upon this model, in the third stage. The SHARP APL im- 
plementation of the Box-Jenkins technique guides the user through the entire 
process, from the model identification stage to the generation of forecast sets. 
Upon the completion of the model identification, preliminary estimation and 
model estimation stages of the procedure, the functions ask whether or not the 
user is ready to go on to the next stage in the analysis. If an affirmative 
response is given, the package continues with its direct execution. As an alter- 
native to this comprehensive analytical process, any or all of the three stages 
can be accessed directly by executing the functions IDENTIFY, ESTIMATE or 
FORECAST, respectively. For example, this alternative approach might be taken 
if an analyst is confident that the model which generated the original series 
is known beforehand. In this case, only the resulting forecasts would be of 
interest, and only the function FORECAST need be executed. 


For details and examples concerning the use of this package, see the manual 
entitled “Box-Jenkins in SHARP APL”, available from your I.P. Sharp repre- 
sentative. 


39 MASSAGER Functions in this workspace comprise a time series manipulation package called 
Massager Plus. Data manipulation techniques were selected as being representative 
of those operations on time series most likely to be of use in conjunction 
with the public socio-economic data bases available in SHARP APL, although 
they are also intended for possible use with users' private data bases. 


Operations performed include the calculation of certain descriptive statistics, 
change of timeframe and series periodicity, calculation of correlation matrices 
and series autocorrelation, detrending, weighted moving sums, exponential 
smoothing, and linear interpolation. 


Massager Plus is essentially a subroutine package for time series data manipulation 
and does not address problems of data access or report generation. 
In contrast, our MAGIC time series system is highly integrated, serving all 
three purposes, but therefore lacking some of Massager's modularity. The two 
systems are complementary and together span a wide range of requirements. 


For additional information concerning either of these two time series manipulation 
packages, see the manuals "SHARP APL Massager Plus" and "MAGIC 
for Time Series Analysis", both available from your I.P. Sharp representative. 


39 TRANSFERFNS The workspace 39 TRANSFERFNS performs the analysis and forecasting of 
transfer function models. Representing in effect the bivariate (multivariate) 
extension to univariate Box-Jenkins ARIMA models, transfer function models 
are useful in describing the behaviour of industrial processes and of economic 
and business systems. 


The structure of the workspace is identical to that of the workspace 
39 BOXJENKINS, the procedure being separated into the following three distinct 
stages: 


- Model identification and preliminary estimation 
- Model estimation and diagnostic checking 
- Forecasting 


As is the case with the univariate analysis routines in 39 BOXJENKINS, there 
are three functions (called IDENTIFY, ESTIMATE and FORECAST) which can 
be executed by the user. The three functions in this workspace, however, are 
dyadic rather than monadic, the left argument representing the input series and 
the right argument the output series. Again, the user can execute any of these 
directly, or, alternatively, be led conversationally through the entire procedure, 
beginning with the function IDENTIFY. 


For additional information concerning this workspace, see the manual entitled 
“Box-Jenkins in SHARP APL”. 


39 X11 The functions in this workspace perform the seasonal adjustment of monthly 
or quarterly time series according to the X-11 Variant of the Census Method 
II Seasonal Adjustment Program (U.S. Department of Commerce, Bureau of 
the Census). Selection of all available output features is controlled by means 
of state setting functions, as is the selection of special adjustments designed to 
account for strikes, holidays and trading-day variation. Adjustment may be 
performed as a T-task, N-task or B-task, and a number of different series may 
be adjusted for seasonality by means of a single task submission. 


For additional information concerning this package, see the manual entitled 
“X-11 in SHARP APL”. 


This manual is divided into five main sections with each section corresponding to one of the first five 
libraries listed above. At the beginning of each section, the workspaces contained in that library are 
listed. Essentially, this list can be used as a table of contents for the section. 


Each section is then set up as follows. The first workspace listed for the library is described (i.e., 
the functions contained in the workspace are listed and briefly described, and a reference to the 
corresponding on-line documentation is provided). Each function is then documented in detail (i.e., 
function syntax is given, examples and references are provided, and so on). All other workspaces listed 
at the beginning of the section for that library are documented in a similar manner. 


In other words, the documentation is broken down by library, then by workspace within the library, 
and finally by function within the workspace. 


A consistent convention for the grouping of functions has been followed throughout the Statistical 
Analysis Library. Within each workspace, there exists an APL group corresponding to every function 
in that workspace which is described in this manual. The group name is simply the appropriate 
function name preceded by the letter G. (For example, the group GREGR corresponds to the function 
REGR in the workspace 32 REGRESSION.) Each group contains all objects (functions and variables) 
required for a successful execution of the function in question, and as such facilitates the copying of 
programs from public library workspaces into user's private workspaces. 


Error messages and possible corrective actions are covered in Appendix A. A glossary of terms and 
a bibliography are provided in Appendices B and C, respectively. An index appears at the end of the 
manual. 


HOW TO ACCESS A PARTICULAR WORKSPACE/FUNCTION 


After signing on to SHARP APL, you immediately have available to you an active workspace. This 
is a fixed-size block of storage in which you can store your programs and data and perform any 
required processing. 


There are three system commands which can be used to transfer material into the active workspace: 
)LOAD, )COPY and )PCOPY. These commands are described below. 


)LOAD libnum workspace 


1. Transfers a copy of the entire contents of the workspace specified with workspace from the 
library specified with libnum into the active workspace. 


2. Overwrites completely the contents of the active workspace. 
3. The whole workspace must be transferred. 


4. The active workspace is given the name specified with workspace. 


)COPY or )PCOPY libnum workspace 


1. Transfers a copy of the contents of the workspace specified with workspace from the library 
specified with libnum into the active workspace. 


2. Adds to the contents of the active workspace, rather than completely destroying its previous 
contents. 


3. All or part of a workspace can be transferred. For example, )COPY 31 ENTRY INPUT indicates 
that you want only the function INPUT to be copied from the workspace ENTRY in library 31. 


4. Leaves the active workspace's name unchanged. 


Note that with all of these commands only a copy of the workspace is transferred, while the library 
copy remains unchanged. 


The most important difference between )LOAD and )COPY is that )LOAD completely overwrites the 
contents of the active workspace, destroying all that was previously resident therein. )COPY augments 
the contents of the active workspace with all specified objects (functions and variables) from the 
specified workspace. 


The only circumstance in which )COPY destroys contents of the active workspace is when an object 
of the same name exists in both the active and copied workspaces. To obviate this possibility, use the 
command )PCOPY (protective copy) rather than )COPY. This copies into the active workspace only 
those objects in the specified workspace whose names are not already in use in the active workspace. 
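

The difference between the three commands can be pictured with a small model. The following is only an illustrative Python sketch, not SHARP APL: a workspace is modelled as a dictionary from object names to objects, and the function names `load`, `copy` and `pcopy` as well as the object name PRINT are hypothetical stand-ins chosen for this illustration.

```python
def load(active, library_ws):
    """Model of )LOAD: the active workspace is replaced by a copy of the library workspace."""
    return dict(library_ws)


def copy(active, library_ws, names=None):
    """Model of )COPY: objects are added; a same-named active object is overwritten."""
    selected = library_ws if names is None else {n: library_ws[n] for n in names}
    merged = dict(active)
    merged.update(selected)
    return merged


def pcopy(active, library_ws, names=None):
    """Model of )PCOPY: only objects whose names are not already in use are added."""
    selected = library_ws if names is None else {n: library_ws[n] for n in names}
    merged = dict(active)
    for name, obj in selected.items():
        merged.setdefault(name, obj)
    return merged


# Hypothetical contents: the active workspace already holds its own INPUT.
active = {"INPUT": "my own INPUT", "X": [1, 2, 3]}
entry = {"INPUT": "library INPUT", "PRINT": "library PRINT"}

after_load = load(active, entry)      # X is gone, INPUT is the library version
after_copy = copy(active, entry)      # X kept, INPUT overwritten
after_pcopy = pcopy(active, entry)    # X kept, own INPUT kept, PRINT added
```

In every case the library dictionary itself is left untouched, which mirrors the fact that only a copy is transferred.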


For example, to access the workspace for analysis of variance, enter 


)LOAD 34 ANOVA 
SAVED 16.05.01 06/09/78 


Now the functions described in the documentation for the workspace 34 ANOVA are available in your 
active workspace. If you want to simply add to the contents of the active workspace, enter 


)PCOPY 34 ANOVA 


To add to your active workspace only those objects (subfunctions and global variables) required for 
a successful execution of the function NWAYANOVAC, enter 


)PCOPY 34 ANOVA GNWAYANOVAC 


Note that the command 


)GRP GNWAYANOVAC 


suffices to list all members (functions and variables) of the group GNWAYANOVAC. 


Note that you enter the workspace name first, and then specify the object(s) to be copied from that 
workspace. 


In order to save the material contained in the active workspace as a workspace in your own private 
library, enter the )SAVE command and specify an appropriate name for the workspace. For example, 
if you enter 


)SAVE MYSTUFF 


the system saves a copy of the active workspace in your private library under the name MYSTUFF. 
You can access this workspace later by typing 


)LOAD MYSTUFF 


Subsequent saving of a workspace of this name overwrites the previous version of MYSTUFF and 
replaces it with a new version. 


LIBRARY 31 


GENERAL STATISTICAL FUNCTIONS 


Workspaces: 


CTAB 
DSTAT 
ENTRY 
FREQUENCY 
MISC 
NONPARAMETRIC 
PARAMETRIC 
TTEST 


31 CTAB 


FUNCTIONS 


Function Header    Documentation    Description 


EXP CHI OBS        CHIHOW           Calculates chi-square, degrees of freedom and probability 
                                    for expected and observed contingency tables. 

Z←CHI1 X           CHISQHOW         Gives chi-square and degrees of freedom for a contingency 
                                    table bigger than 2x2. 

Z←CHI2 X           CHISQHOW         Gives chi-square and degrees of freedom for a 2x2 
                                    contingency table. 

CHISQ              CHISQHOW         Computes chi-square, degrees of freedom, phi coefficient, 
                                    Cramer's statistic, and contingency coefficient for a 
                                    contingency table. 

CTAB DATA          CTABHOW          Performs 2- or 3-way crosstabulations. 

Z←FISHER X         CHISQHOW         Calculates Fisher exact probability for a 2x2 table. 

GAMMA              GAMMAHOW         Computes the Goodman-Kruskal gamma coefficient for 
                                    a contingency table. 


CHI 
Syntax 
EXP CHI OBS 
Description 


CHI calculates and displays chi-square, its degrees of freedom, and its probability. The last figure is 
generally accurate to at least three decimal places. 


If required, CHI displays expected values calculated from observed values of a single sample or any 
sized contingency table. CHI automatically conflates classes, if necessary, to eliminate expected values 
of less than 1, and to reduce to 20 per cent or less the proportion of expected values less than 5. 
Yates’ correction for continuity is automatically applied when there is a single degree of freedom. 


Arguments 


EXP - Must be a vector or matrix of expected values, or 0 if expected values are to be calculated 
by the function itself. 

OBS - Must be a matrix or vector of observed values. 


Output 

1) Expected values (table) 
2) Chi-square 

3) Degrees of freedom 

4) Probability 


Example: One sample 


Taken from Siegel, page 45. See Reference. 


EXP 
18 18 18 18 18 18 18 18 
OBS 
29 19 18 25 17 10 15 11 
EXP CHI OBS 
EXPECTED VALUES 18 18 18 18 18 18 18 18 
CHI-SQUARE 16.33333333 
DEGREES OF FREEDOM 7 
PROBABILITY 0.02223970421 
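

The figures in this example can be checked by hand. The following Python sketch is an illustration only and is not the APL code of CHI: it sums (observed - expected)² / expected over the classes and evaluates the upper-tail probability with the standard closed form for an odd number of degrees of freedom. The function names are hypothetical, and the conflation of classes and Yates' correction described above are not modelled.

```python
import math


def chi_square(expected, observed):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for e, o in zip(expected, observed))


def chi_square_tail_odd_df(x, df):
    """Upper-tail probability of chi-square for an odd number of degrees of freedom."""
    if df < 1 or df % 2 == 0:
        raise ValueError("this closed form is valid for odd df only")
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2))            # the df = 1 case
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term = root                                       # x**(k - 1/2) / (1*3*...*(2k-1)), k = 1
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return tail + 2 * density * total


EXP = [18] * 8
OBS = [29, 19, 18, 25, 17, 10, 15, 11]

statistic = chi_square(EXP, OBS)
df = len(OBS) - 1                                     # one sample: classes minus one
probability = chi_square_tail_odd_df(statistic, df)
print(statistic, df, probability)
```

Under these assumptions the sketch reproduces the chi-square, degrees of freedom and probability printed in the example.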


Example: k samples 


Taken from Siegel, page 177. See Reference. 


EXP 
7.3 30.3 38 5.4 
18.6 77.5 97.1 13.8 
9.1 38.2 47.9 6.8 
OBS 
23 40 16 2 
11 75 107 14 
1 31 60 10 
EXP CHI OBS 
EXPECTED VALUES 
7.3 30.3 38 5.4 
18.6 77.5 97.1 13.8 
9.1 38.2 47.9 6.8 
CHI-SQUARE 69.07632536 
DEGREES OF FREEDOM 6 
PROBABILITY 6.323684775E¯13 
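

As a cross-check of the k-sample example, the following Python sketch (an illustration, not the APL code of CHI) computes chi-square from the expected and observed tables shown, takes the degrees of freedom as (rows - 1) × (columns - 1), and uses the standard closed form of the upper-tail probability for an even number of degrees of freedom. The function names are hypothetical.

```python
import math

EXP = [[7.3, 30.3, 38.0, 5.4],
       [18.6, 77.5, 97.1, 13.8],
       [9.1, 38.2, 47.9, 6.8]]
OBS = [[23, 40, 16, 2],
       [11, 75, 107, 14],
       [1, 31, 60, 10]]


def chi_square(expected, observed):
    """Sum of (O - E)^2 / E over all cells of the table."""
    return sum((o - e) ** 2 / e
               for exp_row, obs_row in zip(expected, observed)
               for e, o in zip(exp_row, obs_row))


def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a contingency table."""
    return (len(table) - 1) * (len(table[0]) - 1)


def chi_square_tail_even_df(x, df):
    """Upper-tail probability of chi-square for an even number of degrees of freedom."""
    if df < 2 or df % 2:
        raise ValueError("this closed form is valid for even df only")
    half = x / 2
    term, total = 1.0, 0.0
    for k in range(df // 2):          # terms (x/2)**k / k!
        total += term
        term *= half / (k + 1)
    return math.exp(-half) * total


statistic = chi_square(EXP, OBS)
df = degrees_of_freedom(OBS)
probability = chi_square_tail_even_df(statistic, df)
print(statistic, df, probability)
```

With the expected values as printed, this gives a chi-square of about 69.076 on 6 degrees of freedom and a probability of roughly 6.3E¯13, in agreement with the output above.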
Reference 


Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 6. 
Source 


Dr. G.H. McLaughlin 

Newhouse Communications Center 
Syracuse University 

Syracuse, N.Y. 


(Modified by F. Arthur, I.P. Sharp Associates Limited) 


CHISQ 


Syntax 
CHISQ 


Description 


CHISQ computes and prints the chi-square value for any contingency table, along with degrees of 
freedom, the phi coefficient, Cramer's statistic and the contingency coefficient. 


Input 


A contingency table is requested conversationally and must be an NxM matrix. 


Output 


The output for chi-square and modified chi-square consists of the chi-square value, the degrees of 
freedom, the phi coefficient, Cramer's statistic, and the contingency coefficient. For the Fisher exact 
probability test, the output consists of the Fisher exact probability of a result at least this extreme, 
and the Tocher probability of a result more extreme than this. If the exact probability of this outcome 
is desired, it is the difference between the two. 


Example 


Taken from Siegel, page 198. See Reference. 


DATA 
23 40 16 2 
11 75 107 14 
1 31 60 10 
CHISQ 
ENTER CONTINGENCY TABLE 
DATA 
EXPECTED FREQUENCIES 
7.269230769 30.32307692 38.00769231 5.4 
18.57692308 77.49230769 97.13076923 13.8 
9.153846154 38.18461538 47.86153846 6.8 


CHI SQUARE 69.38932827 
DEGREES OF FREEDOM 6 
PHI COEFFICIENT 0.4218072481 
CRAMER'S STATISTIC 0.2982627655 
CONTINGENCY COEFFICIENT 0.3886475108 
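

The quantities in this example are related by simple formulas. The Python sketch below is an illustration and not the APL code of CHISQ; it assumes the usual textbook definitions (expected frequency = row total × column total / N, phi = √(χ²/N), Cramer's statistic = √(χ²/(N(k - 1))) with k the smaller table dimension, contingency coefficient = √(χ²/(χ² + N))). The function names are hypothetical.

```python
import math

DATA = [[23, 40, 16, 2],
        [11, 75, 107, 14],
        [1, 31, 60, 10]]


def expected_from_margins(table):
    """Expected frequencies: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def association_measures(table):
    """Return chi-square, degrees of freedom, phi, Cramer's statistic, contingency coefficient."""
    expected = expected_from_margins(table)
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    df = (len(table) - 1) * (len(table[0]) - 1)
    phi = math.sqrt(chi2 / n)
    cramer = math.sqrt(chi2 / (n * (k - 1)))
    contingency = math.sqrt(chi2 / (chi2 + n))
    return chi2, df, phi, cramer, contingency


chi2, df, phi, cramer, contingency = association_measures(DATA)
print(chi2, df, phi, cramer, contingency)
```

These definitions reproduce the five values printed above, which suggests, without confirming, that CHISQ uses the same formulas.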


Notes and Hints 


1) The subfunction CHI1, with the syntax Z←CHI1 X, computes an ordinary chi-square for a contingency 
table larger than 2x2, where X is the contingency table and Z is a vector of length two, 
with the first element being the chi-square and the second the degrees of freedom. 


2) The subfunction CHI2, with the syntax Z←CHI2 X, computes chi-square as modified for 2x2 
contingency tables and corrected for continuity, where X and Z are the same as in CHI1. 


3) The subfunction FISHER, with the syntax Z←FISHER X, computes the Fisher exact probability 
for a 2x2 contingency table, where X is the contingency table and Z is again a vector of length 
two, whose elements are the Fisher exact probability of a result at least this extreme (one tailed), 
and the Tocher modification which gives the probability of a more extreme result. The difference 
between these two is the exact probability of this particular event. 


4) The program determines whether to use regular chi-square, modified chi-square or the Fisher 
exact probability. 


5) The separate functions CHI1, CHI2, and FISHER do no checking of data, and it is your 
responsibility to make sure that the data meets the requirements for the particular test being used. 
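

The relationship between the two elements returned by FISHER can be illustrated as follows. This Python sketch is not the APL code of FISHER; it assumes the usual hypergeometric probability of a 2x2 table with fixed margins, and it takes the "more extreme" value, as described above, to be the probability of all tables further from expectation than the one observed. The function names are hypothetical.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of the 2x2 table [[a, b], [c, d]] with fixed margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_one_tailed(a, b, c, d):
    """Return (P(result at least this extreme), P(result more extreme)), one tailed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    step = 1 if a * n >= row1 * col1 else -1      # move away from the expected value of a
    lowest = max(0, row1 + col1 - n)
    highest = min(row1, col1)
    at_least, more_extreme = 0.0, 0.0
    x = a
    while lowest <= x <= highest:
        p = table_probability(x, row1 - x, col1 - x, n - row1 - col1 + x)
        at_least += p
        if x != a:
            more_extreme += p
        x += step
    return at_least, more_extreme


# Hypothetical 2x2 table, chosen only to show the arithmetic.
at_least, more_extreme = fisher_one_tailed(2, 1, 1, 2)
exact = at_least - more_extreme                    # probability of the observed table itself
print(at_least, more_extreme, exact)
```

For this table the two values are 0.5 and 0.05, and their difference, 0.45, is the probability of the observed table.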


Reference 


Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 6. 


Source 


Dr. K.W. Smillie 

Department of Computing Science 
University of Alberta 

Edmonton, Alberta 


CTAB 


Syntax 


CTAB DATA 


Description 


CTAB will perform 2-way or 3-way crosstabulations, giving raw frequencies and column or row 
percentages. A facility for grouping the data is available. Chi-square values and degrees of freedom 
are given for each contingency table. 


Argument 


DATA - A matrix of data where the columns are variables and the rows are observations (or 
individuals). Variables are referred to by column number. 


Input 


1) When asked for grouping information, four numbers are required, with at least one space between 
them: the variable number (meaning the column number in DATA), the left-hand end point, width, 
and number of classes desired for the classification. (Note: the syntax is similar to that of 
FREQ in the workspace 31 FREQUENCY.) 


2) When asked for tables desired, the following syntax applies: 

1 VS 2 results in a table with V1 in rows and V2 in columns. 

1 VS 3 VS 5 results in a series of tables with V3 as rows and V5 as columns; there 
will be one such matrix for each group or value in V1. 


3) To obtain row and/or column percentage tables for a particular classification, use the following 
syntax: 

ROW AND COLUMN 1 VS 2 results in tables showing both row and column percentages, as well 
as raw frequency tables. 

COLUMN 1 VS 2 just column percentages (and raw frequencies). 
ROW 1 VS 2 just row percentages (and raw frequencies). 


In both the above cases, use the word STOP to signal end of information entry. 
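

The grouping information in item 1 (left-hand end point, width, number of classes) can be read as follows. This Python sketch is only an illustration, not the APL code of CTAB: it assumes that each class includes its left-hand end point and excludes its right-hand one, which the manual does not state, and it returns 0 for values outside all class ranges, as described in the notes for this function. The function name `group_class` is hypothetical.

```python
import math


def group_class(value, left, width, classes):
    """Class number (1..classes) of a value, or 0 if it lies outside all classes.

    Assumption: class i covers left + (i-1)*width <= value < left + i*width.
    """
    if width <= 0 or classes <= 0:
        raise ValueError("width and number of classes must be positive")
    index = math.floor((value - left) / width)
    return index + 1 if 0 <= index < classes else 0


# Hypothetical grouping of an age variable: start at 17, width 3, 3 classes,
# i.e. 17-19, 20-22 and 23-25 for whole-number ages.
ages = [20, 23, 19, 26, 17]
classes = [group_class(age, 17, 3, 3) for age in ages]
print(classes)
```

Here the age 26 falls beyond the last class and is therefore reported as class 0.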
Output 


1) Mean and standard deviation for each variable. 

2) Tables showing in each cell the frequency of the characteristic in that column and row occurring 
in the same individual, marginal totals, and the chi-square value for that contingency table. 

3) Following the raw frequency observation table, a table showing row and/or column percentages 
is shown, if requested. 

4) Key to group classification if grouping was requested. 
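
The tabulation and chi-square steps described above can be illustrated outside APL. The following Python sketch is not the CTAB source; it assumes an ordinary Pearson chi-square with (rows-1) x (columns-1) degrees of freedom, and the names `crosstab` and `chi_square` are hypothetical.

```python
from collections import Counter


def crosstab(rows, r, c):
    """Count how often each (row value, column value) pair occurs.

    rows is a list of observations; r and c are 0-based column indexes.
    """
    counts = Counter((obs[r], obs[c]) for obs in rows)
    row_vals = sorted({key[0] for key in counts})
    col_vals = sorted({key[1] for key in counts})
    table = [[counts.get((rv, cv), 0) for cv in col_vals] for rv in row_vals]
    return row_vals, col_vals, table


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, df
```

Under this assumption a table with 9 rows and 6 columns has 40 degrees of freedom.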

Example 


(See next page.) 


Notes and Hints 


1) If a variable is grouped, GN will appear at the appropriate point in the table instead of VN, and
this means that the numbers refer to classes, not actual values (where 0 is for all values outside
class ranges). If a variable is ungrouped, actual data values appear on the table.

2) After all tables requested are printed, you are asked if any more are required.

3) You may, if desired, have blanks in the output instead of zeros, by typing NZERO before using
the function (type ZERO to reinstate zeros). If you do not specify otherwise, zeros are printed
by default.

4) You should not make redundant requests, as these will not be eliminated by the system. For
example, requesting 1 VS 2 and COLUMN 1 VS 2 will result in the printing of the raw frequencies
for variable 1 VS variable 2 twice.

5) The workspace 31 CROSSTAB contains a comprehensive crosstabulation analysis facility. The
package provides easy construction and editing of a private data base structured around attributes,
labels, keys and inquiries. For additional information concerning this facility, see the manual
entitled “CROSSTAB: A Crosstabulation Package in SHARP APL” available from your SHARP
APL representative.


Example 


The input data represents the age, year in university, grade and sex, respectively, for 21 students. 


DATA
20 2 80 0 
23 4 75 1 
19 2 68 0 
26 9 91 0 
17 2 73 0 
21 5 6% 1 
19 T7 1 
18 1 68 1 
19 3 59 0
20 6 76 1 
21 4 4g 1 
18 2 70 0 
17 1 75 0 
19 1,672 -1 
20 2 78 0 
20 3 83 0
21 4 65 0 
22 5 67 1 
26 8 7% 1 
19 2 65 0 
25 7 76 1 

CTAB DATA

DO YOU WISH ANY GROUPINGS?(Y OR N)
Y
ENTER VAR. NO. , LEFT HAND END PT OF THE 1'ST GROUP ,
GROUP WIDTH , NO. OF GROUPS.
⎕:
1,15,5,2
ENTER VAR. NO. , LEFT HAND END PT OF THE 1'ST GROUP ,
GROUP WIDTH , NO. OF GROUPS.
⎕:
3,60,5,5
ENTER VAR. NO. , LEFT HAND END PT OF THE 1'ST GROUP ,
GROUP WIDTH , NO. OF GROUPS.
⎕:
STOP
ENTER TABLES
⎕:
3 VS 4
ENTER TABLES
⎕:
2 VS 3
ENTER TABLES
⎕:
STOP


ХЕТ 0 1 2 3 4 5 TOTAL 

1 2 1 1 4 

2 2 2 1 1 6 

3 1 1 2 

4 1 1 1 3 

V 5 1 1 2 
2 6 1 1 
7 1 1 

8 1 1 

9 1 1 

TOTAL 3 1 6 4 5 2 21 


D. F.= 40 CHI-SQUARE= 39.69583333


Vu 


0 1 | TOTAL 


D. F.= 5 CHI-SQUARE= 3.493636364
DO YOU WISH ANY MORE TABLES?(Y OR N)
Y


ENTER TABLES
⎕:
COLUMN 1 VS 3
ENTER TABLES
⎕:
STOP


| 0 1 2 3 4 5 { TOTAL 


TOTAL | 3 1 6 4 5 2 | 21 


D. F.= 10 CHI-SQUARE= 10.73333333


COLUMN PERCENTAGES 


DO YOU WISH ANY MORE TABLES?(Y OR N)
N 


GROUPED VARIABLE NO. 1 

GROUP LEFT ENDPT RIGHT ENDPT 
0 **** CONTAINS ELEMENTS OUTSIDE CLASS BOUNDARIES ****
1 15.000 20.000 
2 20.000 25.000 

GROUPED VARIABLE NO. 3 


GROUP LEFT ENDPT RIGHT ENDPT 


0 **** CONTAINS ELEMENTS OUTSIDE CLASS BOUNDARIES ****
1 60.000 65.000 
2 65.000 70.000 
3 70.000 75.000 
4 75.000 80.000 
5 80.000 85.000 
Reference 


Freund, J.E. Modern Elementary Statistics, Second Edition, Prentice-Hall, 1964, Chapter 12. 
Source 


L. Gibson, S. Maxwell, S. Swaminathan
Institute of Computer Science 

University of Guelph 

Guelph, Ontario 


GAMMA 
Syntax 
GAMMA 
Description 


GAMMA computes and prints the Goodman-Kruskal gamma coefficient, which gives the degree of 
relationship for a contingency table consisting of ordered classes. 


Input 
A contingency table is requested conversationally. 
Output 


The Goodman-Kruskal gamma coefficient is printed. No test of significance is known for this
coefficient, so none is printed.


Example 

Taken from Goodman and Kruskal, page 752. See Reference. 
DATA 

102 35 68 34 


194 80 215 122 
110 30 168 223 


GAMMA 
ENTER FREQUENCY TABLE. 
⎕:

DATA


GOODMAN-KRUSKAL GAMMA COEFFICIENT..... 0.2986641718
Notes and Hints 


The subfunction GAMMA3 has one argument, the table, and returns a scalar result, the gamma
coefficient.
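
As an illustration of what a gamma coefficient measures, the Python sketch below uses the textbook definition (P - Q) / (P + Q), where P counts concordant pairs and Q counts discordant pairs in a table with ordered classes. It is an assumption that GAMMA3 follows this definition; the function name `goodman_kruskal_gamma` is hypothetical.

```python
def goodman_kruskal_gamma(table):
    """Gamma = (P - Q) / (P + Q) for a contingency table with ordered classes."""
    rows, cols = len(table), len(table[0])
    concordant = 0  # pairs ordered the same way on both classifications
    discordant = 0  # pairs ordered in opposite ways
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):
                for l in range(cols):
                    if l > j:
                        concordant += table[i][j] * table[k][l]
                    elif l < j:
                        discordant += table[i][j] * table[k][l]
    return (concordant - discordant) / (concordant + discordant)
```

For a 2 by 2 table this reduces to Yule's Q, (ad - bc) / (ad + bc).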


Reference 


Goodman, L.A. and W.H. Kruskal. “Measures of Association for Cross Classifications”, Journal of
the American Statistical Association, Vol. 49, 1954, pp. 732-764. 
31 DSTAT


FUNCTIONS

Function
Header            Documentation   Description

DS DATA           DSHOW           Gives sample size, mean, variance and
                                  standard deviation for vectors or matrices.

DSTAT DATA        DSTATHOW        Gives 11 different descriptive statistics
                                  for vectors or matrices.

R←KURTOSIS X      KURTOSISHOW     Gives the kurtosis of a vector of observations.

R←MEAN X          MEANHOW         Gives mean of a vector of observations.

R←MEANDEV X       MEANDEVHOW      Gives mean deviation of a vector of observations.

R←MEDIAN X        MEDIANHOW       Gives median of a vector of observations.

R←MODE X          MODEHOW         Gives mode of a vector of observations.

R←RANGE X         RANGEHOW        Gives the range of a vector of observations.

R←SKEWNESS X      SKEWNESSHOW     Gives the skewness of a vector of observations.

R←STDDEV X        STDDEVHOW       Gives standard deviation of a vector of
                                  observations.

R←STDERR X        STDERRHOW       Gives standard error of a vector of observations.

R←VAR X           VARHOW          Gives variance of a vector of observations.


DS 
Syntax 
DS DATA 
Description 
DS will analyze the input data and return the following descriptive statistics for each variable entered: 
1) Sample size 
2) Mean 
3) Variance 
4) Standard deviation 
Arguments 
DATA - May be: 
1) A vector of observations, or 
2) A matrix with the columns being variates and the rows being observations. 


Output 


The output is in the form of a labelled table. If matrix input is received then column 1 of the output 
table corresponds to the variable in column 1 of the input matrix, and so on. 


Example - Matrix Input 


DATA 
28 17 5 18 
7 20 4 23 
21 16 6 25 
24 14 8 30 
13 22 15 1 
DS DATA 
V1 V2 V3 V4
NO OBS 5 5 5 5 
MEAN 18.60000 17.80000 7.60000 19.40000 
VARIANCE 72.30000 10.20000 19.30000 124.30000 
STD DEV 8.50294 3.19374 4.39318 11.14899 


Example - Vector Input 


X 

2 3 4 2 10 5 8 9 12 3 2
DS X 

NO OBS 11 

MEAN 5.45455 


VARIANCE 13.27273 
STD DEV 3.64318 


Notes and Hints 
1) DATA may be created using the program INPUT in the workspace 31 ENTRY. 
2) The input matrix can be created using the function AND:

e.g., DS A AND B AND C


where A, B, and C may be either valid vector or matrix input and need not have the same number
of observations.


3) Type DSTATHOW for information concerning a comprehensive set of descriptive statistics. 
Reference 

Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957. 
Source 

L. Gibson, S. Maxwell, S. Swaminathan 

Institute of Computer Science 


University of Guelph 
Guelph, Ontario 
DSTAT
Syntax 
DSTAT DATA 
Description 


DSTAT will analyze the input data and return the following descriptive statistics for each variable 
entered: 


1) Sample size 

2) Mean 

3) Variance 

4) Standard deviation 
5) Standard error of the mean 
6) Mean deviation 
7) Median 

8) Maximum 

9) Minimum 

10) Range 

11) Mode(s) (if any) 


Arguments 
DATA - May be: 
1) A vector of observations, or
2) A matrix with the columns being variates and the rows being observations. 


Output 


The output is in the form of a labelled table. If matrix input is received then column 1 of the output
table corresponds to the variable in column 1 of the input matrix, and so on. 


Example - Matrix Input 


DATA
28 17  5 18
 7 20  4 23
21 16  6 25
24 14  8 30
13 22 15  1


DSTAT DATA


                 V1        V2        V3         V4
NO OBS            5         5         5          5
MEAN       18.60000  17.80000   7.60000   19.40000
VARIANCE   72.30000  10.20000  19.30000  124.30000
STD DEV     8.50294   3.19374   4.39318   11.14899
STD ERR     3.80263   1.42829   1.96469    4.98598
MEAN DEV    6.88000   2.56000   3.12000    7.92000
MEDIAN           21        17         6         23
MAXIMUM          28        22        15         30
MINIMUM           7        14         4          1
RANGE            21         8        11         29


Example - Vector Input 


X 
2 3 4 2 10 5 8 9 12 3 2

DSTAT X
NO OBS 11 
MEAN 5.45455 
VARIANCE 13.27273 
STD DEV 3.64318 
STD ERR 1.09846 
MEAN DEV 3.12397 
MEDIAN 4 
MAXIMUM 12 
MINIMUM 2 
RANGE 10 
MODE(S) 2 


Notes and Hints 
1) DATA may be created using the program INPUT in workspace 31 ENTRY. 
2) The input matrix can be created using the function AND: 

e.g., DSTAT A AND B AND C


where A, B, and C may be either valid vector or matrix input and need not have an equal number
of observations.


3) The subfunction STAT will give the first 10 descriptive statistics for vector input only. The syntax
is STAT DATA.


4) The mode is printed only where there is a value which occurs at least twice. 
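
The eleven statistics listed above can be reproduced for the vector example with a short Python sketch. This is an illustration only, not the DSTAT code; it assumes the variance divides by N-1 and that a mode is reported only when some value occurs at least twice. The name `describe` is hypothetical.

```python
import math
import statistics
from collections import Counter


def describe(x):
    """Return DSTAT-style descriptive statistics for a list of numbers."""
    n = len(x)
    mean = statistics.fmean(x)
    variance = statistics.variance(x)  # sample variance, divides by N-1
    std_dev = math.sqrt(variance)
    counts = Counter(x)
    top = max(counts.values())
    # Report modes only when some value occurs at least twice.
    modes = sorted(v for v, c in counts.items() if c == top) if top >= 2 else []
    return {
        "n": n,
        "mean": mean,
        "variance": variance,
        "std_dev": std_dev,
        "std_err": std_dev / math.sqrt(n),
        "mean_dev": sum(abs(v - mean) for v in x) / n,
        "median": statistics.median(x),
        "maximum": max(x),
        "minimum": min(x),
        "range": max(x) - min(x),
        "modes": modes,
    }
```

Calling `describe([2, 3, 4, 2, 10, 5, 8, 9, 12, 3, 2])` gives the figures shown in the vector example, rounded to five decimals.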
Methodology


The method of calculation of the various statistics is described where required in the appropriate 
subfunction for that statistic. 


References 

Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957. 
Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, 1960. 
Source 

L. Gibson, S. Maxwell, S. Swaminathan 

Institute of Computer Science 


University of Guelph 
Guelph, Ontario 
KURTOSIS
Syntax
R←KURTOSIS X
Description
KURTOSIS calculates the kurtosis of a finite, discrete set of numbers.
Argument
X - A vector of numbers with more than 1 element.
Result
R - The kurtosis of X, defined to be ¯3+(SIGMA ((X[I]-XBAR)÷STD)*4)÷N
Example
X
2 3 4 2 10 5 8 9 12 3 2
R←KURTOSIS X
R
¯1.469017957
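
The kurtosis and skewness definitions in this workspace are both averages of powers of standardized deviations. The Python sketch below is a hedged illustration rather than the workspace code: it assumes STD is the standard deviation computed with N-1, and that 3 is subtracted from the fourth moment. The function names are hypothetical.

```python
import math


def standardized_moment(x, power):
    """Mean of ((x[i] - mean) / std) ** power, with std computed using N-1."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    return sum(((v - mean) / std) ** power for v in x) / n


def kurtosis(x):
    """Fourth standardized moment minus 3 (excess kurtosis)."""
    return standardized_moment(x, 4) - 3


def skewness(x):
    """Third standardized moment."""
    return standardized_moment(x, 3)
```

With these assumptions the vector 2 3 4 2 10 5 8 9 12 3 2 has a kurtosis of about -1.469018, and its skewness is positive because of the long right tail.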
Source 
R. Hui 


I.P. Sharp Associates Limited
Calgary, Alberta 
MEAN
Syntax
R←MEAN X
Description
MEAN computes the mean of a set of numbers.
Argument
X - A vector of numbers.
Result
R - The mean of X.
Example
X
2 3 4 2 10 5 8 9 12 3 2
R←MEAN X
R
5.454545455
Source
A. Whitney


I.P. Sharp Associates Limited
Calgary, Alberta 


MEANDEV 
Syntax 
R←MEANDEV X
Description
MEANDEV computes the mean deviation of a set of numbers.
Argument
X - A vector of numbers.
Result
R - The mean deviation of X.
Example
X
2 3 4 2 10 5 8 9 12 3 2
R←MEANDEV X
R
3.123966942
Source
A. Whitney


I.P. Sharp Associates Limited
Calgary, Alberta 
MEDIAN
Syntax
R←MEDIAN X
Description
MEDIAN computes the median of a set of numbers.
Argument
X - A vector of numbers.
Result
R - The median of X.
Example
X
2 3 4 2 10 5 8 9 12 3 2
R←MEDIAN X
R
4
Source
A. Whitney


I.P. Sharp Associates Limited
Calgary, Alberta 
MODE
Syntax
R←MODE X
Description
MODE computes the mode of a set of numbers.
Argument
X - A vector of numbers.
Result
R - The mode of X.
Example
X
2 3 4 2 10 5 8 9 12 3 2
R←MODE X
R
2
Notes and Hints
When no mode is found (i.e., no value occurs more than once), the result of MODE is the null vector.
Source
A. Whitney


I.P. Sharp Associates Limited
Calgary, Alberta 
RANGE

Syntax
R←RANGE X
Description
RANGE calculates the range of a set of numbers.
Argument
X - A set of numbers.
Result
R - The range of X.
Example

X
2 3 4 2 10 5 8 9 12 3 2

R←RANGE X

R 
10 
Source 
R. Hui 


I.P. Sharp Associates Limited 
Calgary, Alberta 
SKEWNESS
Syntax
R←SKEWNESS X
Description


SKEWNESS calculates the skewness of a set of numbers.


Argument
X - A vector of numbers with more than 1 element.
Result
R - The skewness of X, defined to be (SIGMA ((X[I]-XBAR)÷STD)*3)÷N
Example
X
2 3 4 2 10 5 8 9 12 3 2
R←SKEWNESS X
TRO m 
Source 
R. Hui 


I.P. Sharp Associates Limited
Calgary, Alberta 
STDDEV
Syntax
R←STDDEV X
Description


STDDEV computes the standard deviation of a set of numbers.


Argument
X - A vector of numbers.
Result
R - The standard deviation of X.
Example
X
2 3 4 2 10 5 8 9 12 3 2
R←STDDEV X
R
3.643175438

Methodology

The standard deviation is calculated as the square root of the variance.
Source

A. Whitney


I.P. Sharp Associates Limited
Calgary, Alberta 
STDERR
Syntax
R←STDERR X
Description


STDERR computes the standard error of the mean of the variable X.


Argument

X - A vector of numbers.

Result

R - The standard error.

Example
X

2 3 4 2 10 5 8 9 12 3 2
R←STDERR X
R
1.098458725
Methodology


The standard error of the mean is calculated as the standard deviation divided by the square root
of the number of observations.


Source
A. Whitney


I.P. Sharp Associates Limited
Calgary, Alberta 


VAR
Syntax
R←VAR X
Description
VAR computes the variance of a set of numbers.
Argument


X - A vector of numbers.


Result

R - The variance.

Example
X

2 3 4 2 10 5 8 9 12 3 2
R←VAR X
R
13.27272727
Methodology


The variance is calculated as the sum of the squares of the deviations from the mean, divided by N-1,
where N is the number of observations.
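
The three methodology statements for VAR, STDDEV and STDERR chain together, as the following Python sketch shows. It is written from the descriptions in this manual, not from the APL source, and the function names are hypothetical.

```python
import math


def var(x):
    """Sum of squared deviations from the mean, divided by N-1."""
    n = len(x)
    mean = sum(x) / n
    return sum((v - mean) ** 2 for v in x) / (n - 1)


def stddev(x):
    """Square root of the variance."""
    return math.sqrt(var(x))


def stderr(x):
    """Standard deviation divided by the square root of the number of observations."""
    return stddev(x) / math.sqrt(len(x))
```

For the vector 2 3 4 2 10 5 8 9 12 3 2 these return approximately 13.27272727, 3.643175438 and 1.098458725.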


Reference 

Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957.
Source

A. Whitney


I.P. Sharp Associates Limited
Calgary, Alberta 
31 ENTRY


FUNCTIONS

Function
Header              Documentation   Description

NEW←CORRECT DATA    CORRECTHOW      Allows correction of errors in matrices
                                    or vectors.

DATA←INPUT          INPUTHOW        Allows creation of vectors and matrices.

FORM TAB DATA       TABHOW          Produces a formatted table with optional
                                    titles, multiple-lined column headings,
                                    row labels, and a caption.

TABLE               TABLEHOW        Conversationally assists in the creation,
                                    correction and printing of a formatted
                                    table with optional titles, multiple-lined
                                    column headings, row labels, and a
                                    caption.

CORRECT 

Syntax 
NEW←CORRECT DATA
Description 
CORRECT allows a novice to easily modify a vector or matrix of data. 
Arguments 
DATA - The vector or matrix of data to be edited. 
Result 
NEW - The name to which the modified version of DATA is to be assigned. If you wish to overwrite 

your old version of DATA (i.e., not create a new variable) use the same name as the old name; 

for example: 

DATA←CORRECT DATA

Example 

X
29 73 61 90 52 99 20 48 92 67 31 96 72 4
      75 18 5 69 33 76

X←CORRECT X
DO YOU WISH TO SEE A LIST OF THE AVAILABLE OPTIONS? (Y OR N)
Y
S FOR STOP THE PROGRAM
V - CHANGE A VALUE
I - INSERT A VALUE(S)
D - DELETE A VALUE
P - PRINT THE VECTOR
ENTER THE OPTION CODE
I
ENTER THE VALUE PRIOR TO THE POINT OF INSERTION
⎕:
FIRST
ENTER THE VALUE TO BE INSERTED
⎕:
10
ENTER THE OPTION CODE
V
ENTER THE VALUE TO BE CHANGED
⎕:
52
ENTER THE NEW VALUE
⎕:
25
ENTER THE OPTION CODE
I
ENTER THE VALUE PRIOR TO THE POINT OF INSERTION
⎕:
18
ENTER THE VALUE TO BE INSERTED
⎕:
91
ENTER THE OPTION CODE
D
ENTER THE VALUE TO BE DELETED
⎕:
75
ENTER THE OPTION CODE
P
10 29 73 61 90 25 99 20 48 92 67 31 96 72
      4 18 91 5 69 33 76
ENTER THE OPTION CODE
S
X
10 29 73 61 90 25 99 20 48 92 67 31 96 72
      4 18 91 5 69 33 76


Notes and Hints


1) The program is conversational.


2) The available options are:

S - Stop program
V - Change a value
I - Make an insertion (and the appropriate deletion)
D - Make a deletion (and the appropriate insertion)
P - Print the data being examined
R - Row insertions or deletions
C - Column insertions or deletions


3) When using the insert option, type the word FIRST when you wish to make an insertion at the
beginning of a vector, or matrix row or column. Otherwise, type the value after which the
insertion is to be made.


4) When using the matrix row or column insert facility, the following options are available to you
on input of your new row or column:

a) STOP - to stop input of the data
b) EDIT - to edit the data
c) DISPLAY - to display the data already entered in your row/column


5) Remember, when you request deletion of a row or a column of a matrix, all other rows or columns
after that will be renumbered. Display the matrix again if you lose track.


6) It is crucial to assign the result of the program to some variable name (as shown in Syntax), 
otherwise the corrected array will disappear at the end of the execution of the program 
CORRECT. 


7) If a vector or a row of a matrix is too wide to print on one line, the system will display it by
printing the rest of the numbers on the next line, indented six spaces from the original left margin.


8) This program is normally terminated by use of the STOP option. Only when the program is 
normally terminated will the data be saved. 
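
The vector editing options can be pictured as three small operations that each return a new vector, which is why the result must be assigned. The Python sketch below is an illustration only; it assumes that a value entered at a prompt refers to its first occurrence in the vector, which the manual does not state, and the function names are hypothetical.

```python
def insert_after(data, prior, value):
    """Insert value after the first occurrence of prior; 'FIRST' means the front."""
    pos = 0 if prior == "FIRST" else data.index(prior) + 1
    return data[:pos] + [value] + data[pos:]


def change(data, old, new):
    """Replace the first occurrence of old with new."""
    pos = data.index(old)
    return data[:pos] + [new] + data[pos + 1:]


def delete(data, value):
    """Remove the first occurrence of value."""
    pos = data.index(value)
    return data[:pos] + data[pos + 1:]
```

Applying an insertion of 10 at FIRST, a change of 52 to 25, an insertion of 91 after 18 and a deletion of 75 to the 20-element example vector leaves 21 elements.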


Restrictions 

The program provides for revision of up to 2-dimensional arrays (i.e. matrices and vectors only). 
Source 

L. Gibson, S. Maxwell, S. Swaminathan 

Institute of Computer Science 


University of Guelph 
Guelph, Ontario 




INPUT 
Syntax 
DATA←INPUT
Description 


INPUT simplifies matrix and vector creation and allows insertion, deletion and/or changing of data 
elements. 


Input 


Creation of Vector - The vector length is requested and then the data necessary to fill the vector must 
be supplied. 


Creation of Matrix - The number of rows and columns to be in your matrix are requested. The data 
for the matrix may then be entered (either by row or by column). 


Result 
DATA - Contains the finished vector or matrix. 
Example 


DATA←INPUT
DO YOU WISH TO FORM A VECTOR OR MATRIX?(V OR M)
M
NO. OF ROWS IN YOUR MATRIX?
⎕:
5
NO. OF COLUMNS?
⎕:
3
DO YOU WISH TO ENTER DATA BY COLUMN OR BY ROW?(C OR R)
C
ENTER COLUMN 1
⎕:
77 61 31 51 47
ENTER COLUMN 2
⎕:
198 135 60
CONTINUE
⎕:
105 92
ENTER COLUMN 3
⎕:
796 2347 1053 41 113
DO YOU WISH TO EDIT OR VIEW YOUR DATA NOW?(Y OR N)
Y
DO YOU WISH TO SEE A LIST OF THE AVAILABLE OPTIONS?(Y OR N)
Y
S FOR STOP PROGRAM
V - CHANGE A VALUE
I - MAKE AN INSERTION (AND THE APPROPRIATE DELETION)
D - MAKE A DELETION (AND THE APPROPRIATE INSERTION)
P - PRINT THE DATA BEING EXAMINED
R - ROW INSERTIONS OR DELETIONS
C - COLUMN INSERTIONS OR DELETIONS
ENTER OPTION CODE
P
77 198  796
61 135 2347
31  60 1053
51 105   41
47  92  113
ENTER OPTION CODE
S

DATA
77 198  796
61 135 2347
31  60 1053
51 105   41
47  92  113


Notes and Hints 


1) The program INPUT is highly conversational.

2) At any time at which data is being requested, the following key words may be typed instead:

a) STOP - to stop the program
b) EDIT - to edit the data (check documentation for the CORRECT program in this workspace)
c) DISPLAY - to display the data

3) On entering data, if the word CONTINUE is typed for you, it is a signal that more data is required
to fill the column, row or vector being entered.

4) The editing capabilities are equivalent to those of the CORRECT program in this workspace. See
CORRECTHOW for further details. You can use these capabilities by answering YES to the question
'DO YOU WISH TO EDIT OR VIEW YOUR DATA NOW?'.

5) If you already have separate variables stored in vector form and now wish to create a matrix
of data from them, you may use the variable names (rather than the raw data) when asked to
enter the row or column of data to which the name applies.

For instance, if the variable Z contains the data 12 13 17 19 42 which you wish in the second
row of your matrix, then you could type the name Z rather than the data itself when asked to
enter that row.

6) It is crucial to assign the result to some variable name as shown in the Syntax, and to terminate
normally. Otherwise, the array will be created, but will disappear upon completion of the
execution of the program.
Restrictions
1) The program allows creation of up to 2-dimensional arrays only (i.e., vectors and matrices).


2) Larger matrices may be created if the data is kept in integer form while the matrix or vector 
is being entered. Then apply a correction factor to convert it to the correct decimal form. 


Source 


L. Gibson, S. Maxwell, S. Swaminathan 
Institute of Computer Science 

University of Guelph 

Guelph, Ontario 
TAB
Syntax 
FORM TAB DATA 
Description 
TAB enables the user to set up a formatted table with optional: 
1) Multi-lined titles 
2) Multi-lined column headings 
3) Single line row labels 
4) Caption 
as well as allowing a choice of the decimal point accuracy for the numbers in the table. 
Arguments 
FORM - A character string (which must be enclosed in quotes), made up of the following:
1) An optional string of title lines, each of which must be preceded by a ∩, followed by:
2) An optional string of column headings, each of which must be preceded by a ⊂. If more
than one-line column headings are required, each line must be separated from the next
by a |. This is followed by:
3) An optional string of row labels, each of which must be preceded by a ⊃. Following this:
4) An optional caption, preceded by a ∪.
DATA - The array (of rank ≤2 and any dimension) which makes up the body of the table.
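
To make the layout of the FORM string concrete, here is a Python sketch that splits such a string into its four parts. It is an illustration only, not the TAB code, and it assumes the four delimiters are the APL set symbols ∩ (title line), ⊂ (column heading), ⊃ (row label) and ∪ (caption), with | separating the lines of one column heading. The name `parse_form` is hypothetical.

```python
def parse_form(form):
    """Split a TAB-style format string into titles, columns, rows and caption."""
    sections = {"∩": [], "⊂": [], "⊃": [], "∪": []}
    current = None
    for ch in form:
        if ch in sections:
            # A delimiter starts a new item in its own section.
            current = ch
            sections[ch].append("")
        elif current is not None:
            sections[current][-1] += ch
    return {
        "titles": sections["∩"],
        # Each column heading may span several lines separated by |.
        "columns": [heading.split("|") for heading in sections["⊂"]],
        "rows": sections["⊃"],
        "caption": sections["∪"][0] if sections["∪"] else "",
    }
```

An empty string yields no titles, headings, labels or caption, which corresponds to tabulating the data without annotation.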
Input 
The only input is the decimal point accuracy desired.
Output 


The formatted table is printed after allowing the user to adjust the paper. 
Example


FORMAT
∩ORDER FORM⊂PART|NO.⊂UNIT|PRICE⊂QTY.⊂PRICE⊃NUTS⊃BOLTS⊃SCREWS∪RUSH DELIVERY
X
 76 0.05 8 0.4
142 0.1  4 0.4
 37 0.06 2 0.12


FORMAT TAB X


TYPE IN THE DECIMAL POINT ACCURACY WANTED.
⎕:
2
ADJUST YOUR PAPER AND THEN PRESS THE RETURN KEY.
                 ORDER FORM

            PART    UNIT
             NO.   PRICE    QTY.   PRICE
NUTS   |   76.00    0.05    8.00    0.40 |
BOLTS  |  142.00    0.10    4.00    0.40 |
SCREWS |   37.00    0.06    2.00    0.12 |


RUSH DELIVERY


Notes and Hints 


1) Entering just '' TAB DATA will result in the tabulating of the data, without annotation, all with
the same decimal point accuracy.

2) The output cannot be stored.

3) Because the symbols ∩ ⊂ ⊃ ∪ | are used as delimiters they must not be used in the headings
themselves.

4) The related function TABLE assists the user in setting up all the different headings, and allows 
for optional marginal totals. (See documentation for TABLE in this workspace.) 

5) If the table is too wide for the present width setting it will not be printed. 

Source 


Dr. G.H. McLaughlin 

Newhouse Communications Center 
Syracuse University 

Syracuse, N.Y. 


TABLE 

Syntax 
TABLE 
Description 
TABLE assists the user in the construction of a table with optional: 

1) Multi-lined titles 

2) Multi-lined column headings 

3) 1-lined row labels

4) Caption 
as well as allowing for desired decimal point accuracy. 
Input 
The user is first asked whether assistance is required in setting up the array which is to be tabled. 
If so, the subfunction INPUT is used (see INPUTHOW in this workspace); otherwise the data array 
(either variable name or the numbers themselves) is requested. A vector or matrix is expected. 
Incorporation or exclusion of the following options must then be indicated: 

- marginal totals 

- a title 

- the desired decimal point accuracy 

- column headings. If these are to be included, an opportunity is provided to display the headings 

and correct them where necessary. 
- row labels 
- a caption at the bottom of the table. 


Output 


The formatted table with all headings. 


Example 

X
5.428571429 6.285714286 4.285714286 3.142857143 
2.714285714 2.571428571 0.4285714286 1.428571429 
3 6.714285714 2 1.714285714 
TABLE
DO YOU WISH ANY ASSISTANCE IN SETTING UP YOUR DATA?
N
ENTER YOUR DATA (EITHER VARIABLE NAME OR NUMBERS).
⎕:
X
DO YOU WISH MARGINAL TOTALS?
Y
DO YOU WISH A TITLE ON YOUR TABLE?
YES
ENTER YOUR TITLE SEPARATING THE LINES WITH THE SYMBOL ∩
THIS IS MERELY∩A TEST CASE
TYPE IN THE DECIMAL POINT ACCURACY WANTED FOR YOUR TABLE.
⎕:
2
DO YOU WISH ANY COLUMN HEADINGS?
YES
HOW MANY SPACES DO YOU WISH BETWEEN EACH OF YOUR COLUMN HEADINGS?
⎕:
2
ENTER YOUR COLUMN HEADINGS SEPARATING EACH LINE OF EACH COLUMN HEADING WITH
A | AND SEPARATING EACH COLUMN HEADING FROM THE NEXT WITH A ⊂.
COLUMN 1⊂TWO-LINE|HEADING⊂COLUMN 3⊂COL 4
DO YOU WISH TO SEE WHAT YOUR COLUMN HEADINGS WILL LOOK LIKE?
Y
          TWO-LINE
COLUMN 1  HEADING   COLUMN 3  COL 4
DO YOU WISH TO RETYPE YOUR COLUMN HEADINGS?
N
DO YOU WISH ANY ROW LABELS?
Y
ENTER YOUR ROW LABELS SEPARATING EACH LABEL FROM THE NEXT WITH THE SYMBOL ⊃.
ROW 1⊃SECOND ROW⊃ROW 3
DO YOU WISH A CAPTION AT THE END OF YOUR TABLE?
Y
ENTER YOUR CAPTION.
CAPTION
ADJUST YOUR PAPER AND THEN PRESS THE RETURN KEY.


                     THIS IS MERELY
                      A TEST CASE
                       TWO-LINE
             COLUMN 1  HEADING   COLUMN 3  COL 4    TOTALS
ROW 1      |     5.43     6.29       4.29   3.14 |   19.14 |
SECOND ROW |     2.71     2.57       0.43   1.43 |    7.14 |
ROW 3      |     3.00     6.71       2.00   1.71 |   13.43 |
Notes and Hints
1) The output cannot be saved.


2) Because the symbols ∩ ⊂ ⊃ ∪ | are used as delimiters, they must not be used in the headings
themselves.


3) The data that was entered or created at the beginning of the program is stored after the program
is finished and can be used for other purposes (it has the name DATA).


4) If the table, as defined, is too wide for the present width setting, it will not be printed.
Source 

Dr. G.H. McLaughlin 

Newhouse Communications Center 


Syracuse University 
Syracuse, N.Y. 


31 FREQUENCY 


FUNCTIONS 

Function
Header                Documentation   Description

FD←MP DIST F          DISTHOW         Converts a vector of class frequencies
                                      back into raw scores.

E←B ENORMFIT F        ENORMFITHOW     Serves basically the same purpose as
                                      NORMFIT, but uses classes with equal
                                      probabilities.

R←P FR DATA           FRHOW           Performs non-conversational frequency
                                      classification.

FREQ DATA             FREQHOW         Creates histograms and frequency
FREQ BINOMIAL DATA                    tables and does normal, Poisson or
                                      binomial fits (conversational).

N←R NEGBIN F          NEGBINHOW       For given frequency distribution,
                                      provides descriptive statistics, the
                                      parameters for negative binomial
                                      distribution, and resultant expected
                                      distributions.

EXP←C NORMFIT OB      NORMFITHOW      For given frequency distribution
                                      provides descriptive statistics, calculates
                                      resultant normal distribution, and tests
                                      for goodness of fit.

R←NTILES DATA         NTILESHOW       Calculates n-tiles (median, quartiles,
                                      deciles, etc.) of a vector of frequencies.
DIST
Syntax
FD←MP DIST F
Description 


DIST converts a vector of class frequencies back into raw scores, assuming that the values of each 
class were originally concentrated at the midpoint. 


Argument 

MP - Either a vector of midpoints, or a vector of two values:
MP[1] - lowest midpoint
MP[2] - class width

F - A vector of frequencies ordered by increasing class size.


Result


FD - A vector in which each class midpoint is repeated as many times as indicated by the frequency
of its class.


Example

FD←2 1 DIST 2 4 6 3 2

FD
2 2 3 3 3 3 4 4 4 4 4 4 5 5 5 6 6
Notes and Hints 


1) Where there are only two frequency classes, use the second input option for MP (i.e., the lowest 
midpoint and the class width). 


2) For the inverse of this function see FR or FREQ in this workspace. 
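
The expansion performed by DIST can be sketched in a few lines of Python. This is an illustration written from the description above, not the APL source. It assumes that a two-element MP is always read as lowest midpoint and class width, which is what note 1 implies; the function name `dist` is hypothetical.

```python
def dist(mp, f):
    """Repeat each class midpoint as many times as its class frequency."""
    if len(mp) == 2:
        # Two values are read as (lowest midpoint, class width).
        lowest, width = mp
        midpoints = [lowest + width * i for i in range(len(f))]
    else:
        midpoints = list(mp)
    raw = []
    for midpoint, count in zip(midpoints, f):
        raw.extend([midpoint] * count)
    return raw
```

With a lowest midpoint of 2, a width of 1 and frequencies 2 4 6 3 2, the result holds 17 values running from 2 to 6.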
Source 

Dr. G.H. McLaughlin 

Newhouse Communications Center 


Syracuse University 
Syracuse, N.Y. 


ENORMFIT 
Syntax 
E←B ENORMFIT F
Description 


ENORMFIT serves the same purpose as NORMFIT, but uses classes with equal probabilities, as
recommended by Kendall and Stuart (see Reference). The Mann-Wald procedure is used to determine the
number of classes required. For a given frequency distribution, ENORMFIT outputs descriptive statistics.
It then calculates the resultant normal distribution, the mean and variance of which are estimated from
the given distribution. The observed distribution is then tested for goodness of fit to the normal
distribution by the chi-square test. If necessary to meet the requirements of the test, the frequency
classes are conflated to give a total of N classes. N-3 degrees of freedom are used because the two
parameters of the normal distribution are estimated from the observations.


Arguments 

B - Either a vector of lower class bounds in increasing order, followed by the highest class bound,
or a vector of class midpoints in increasing order.

F - A vector of frequencies, one for each class defined by B.

Result

E - A matrix of 6 columns:

E[;1] Lower class bounds

E[;2] Upper class bounds

E[;3] Class midpoints

E[;4] Observed frequencies

E[;5] Expected frequencies (all identical)

E[;6] Differences between E[;4] and E[;5].

Output

In addition to returning the result, ENORMFIT prints out these items fully labelled:


Mean, S.E., S.D. and number of values of F; expected values, calculated chi-square, its degrees of 
freedom, and its probability, generally accurate to at least three figures. 
Example


E←¯2 ¯1 0 1 2 ENORMFIT 5 10 20 10 5


MEAN                   0
STANDARD ERROR         0.1564921593
STANDARD DEVIATION     1.10656667
NUMBER OF VALUES       50
EXPECTED VALUES
8.333333333 8.333333333 8.333333333
8.333333333 8.333333333 8.333333333
CHI-SQUARE VALUE       28
DEGREES OF FREEDOM     3
PROBABILITY            3.65087356E¯5


Notes and Hints 


ALPHA is the required significance level for the Mann-Wald procedure. ALPHA is automatically set 
to 0.05. 
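
The descriptive part of this output, and the idea of classes with equal probabilities, can be illustrated with the Python sketch below. It is not the ENORMFIT code: the number of classes is simply passed in as `k` rather than derived by the Mann-Wald procedure, the standard deviation is assumed to use N-1, and both function names are hypothetical.

```python
import math
from statistics import NormalDist


def grouped_stats(midpoints, freqs):
    """Number of values, mean, standard deviation and standard error from grouped data."""
    n = sum(freqs)
    mean = sum(m * f for m, f in zip(midpoints, freqs)) / n
    variance = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs)) / (n - 1)
    std_dev = math.sqrt(variance)
    return n, mean, std_dev, std_dev / math.sqrt(n)


def equal_probability_bounds(mean, std_dev, k):
    """Interior class bounds that cut a normal distribution into k equally likely classes."""
    normal = NormalDist(mean, std_dev)
    return [normal.inv_cdf(i / k) for i in range(1, k)]
```

With k equally likely classes every expected frequency equals n / k, so 50 values in 6 classes give 8.333 per class.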


Reference 


Kendall, M.G., and A. Stuart. The Advanced Theory of Statistics, Vol. 2, Second Edition, Hafner 
Press, New York, 1967. 


Source 
Dr. G.H. McLaughlin 
Newhouse Communications Center 


Syracuse University 
Syracuse, N.Y. 
FR
Syntax
R←P FR DATA
Description


FR produces observed frequencies, R, for a set(s) of observations, DATA, according to a set of classes
designated by P.


Arguments 
DATA - May be: 
1) A vector of observations, or 
2) A matrix with columns being variates and the rows being observations. 


Note: DATA can be created using the program AND as shown in Notes and Hints below.


P - Three numbers with the first number being the left-hand end point of the first frequency class,
the second number being class width, and the third number being the number of classes.


Note: the number of classes must be an integer value.
Result


R - A matrix in which the first column represents the class midpoints and successive columns are
observed frequencies for the corresponding variates of the input.


Note: The first and last rows of the table represent the observed frequencies for the left and right
tails respectively. The first columns of these two rows contain what would be the midpoints
of the next classes at the tails.
Example
Taken from Freund, page 72. See Reference. 


DATA 
52 54 49 60 55 40 58 55 51 57 45 60 56 52 54 

59 53 65 56 44 47 35 46 56 47 53 48 61 
54 48 47 52 51 42 51 55 53 65 55 53 650 
53 56 61 51 57 50 47 63 30 44 58 46 53 
47 62 56 45 48 54 54 58 43 55 52 46 53
63 53 42 52 55 46 60 56 50 51 57 53 58 
59 47 57 55 58 57 54 49 53 56 54 61 58 
45 54 47 49 52 54 49 63 53 45 51 60 62
44 59 47 49 59 51 38 48 54 59 53 46 51 
61 67 52 53 56 44 53 52 54 54 46 56 50 
52 56 52 66 58 52 56 55 57 61 28 37 49 
57 50 59 38 56 49 48 55 50 57 53 44 57 
64 50 55 54 42 53 57 87 55 46 41 60 58 
50 51 46 53 49 55 48 68 47 48 53 55 51 
62 45 50 56 49 55 43 54 52 53 83 7% 64 


RESULT←25 5 10 FR DATA


RESULT 
22.5 0 
27.5 1 
32.5 1 
37.5 4 
42.5 13 
47.5 40 
52.5 65 
57.5 52 
62.5 18 
67.5 5 
72.5 1 
77.5 0


Notes and Hints 


1) The program INPUT in the workspace 31 ENTRY can be used to create data for input to FR by 
using the following syntax:


P FR DATA←INPUT
DATA will then contain the data. 


2) In order to obtain a histogram of the frequencies, execute the following statements:
DSP←P FR DATA
)LOAD 3 PLOT 
MULTIPLE 
HISTOGRAM 
ABSCISSACOL 1 
PLOT DSP 


3) The function AND can be used to create the data matrix by using the following syntax:
DATA←X AND Y AND Z


where X, Y, Z are acceptable forms of vector or matrix data and need not necessarily be 
conformable. 


4) Refer to the function FREQ in this workspace for a program which is more conversational and 
has options to perform fits and generate histograms. FREQ can accept binomial data as input. 


Methodology 
Frequency classifications are performed so that an observation falls into a class if and only if
Left Hand End Point of the class ≤ Observation < Right Hand End Point of the class
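
The classification rule can be sketched in Python as follows. This is an illustration only, not the APL source of FR; the function name fr and the tuple layout of P are assumptions made for the sketch. Each observation is assigned to a class by integer division, and anything outside the classes is clamped into a tail row.

```python
import math

def fr(p, data):
    """Count observations per class; p = (left end point, class width, number of classes)."""
    left, width, nclasses = p
    counts = [0] * (nclasses + 2)                # row 0 = left tail, last row = right tail
    for x in data:
        k = math.floor((x - left) / width) + 1   # left end point inclusive, right exclusive
        counts[min(max(k, 0), nclasses + 1)] += 1
    # Tail rows carry the midpoints that the next classes would have had.
    mids = [left + width * (i - 0.5) for i in range(nclasses + 2)]
    return list(zip(mids, counts))

# Hypothetical observations: 24 lands in the left tail, 75 in the right tail.
table = fr((25, 5, 10), [28, 30, 35, 24, 75])
```

With P equal to 25 5 10 the sketch yields twelve rows whose midpoints run from 22.5 to 77.5, the same layout as the RESULT matrix in the example.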
Reference 
Freund, J.E. Modern Elementary Statistics, Prentice-Hall, Englewood Cliffs, N.J., 1965. 
Source 
L. Gibson, S. Maxwell, S. Swaminathan 
Institute of Computer Science 


University of Guelph 
Guelph, Ontario
FREQ
Syntax
FREQ DATA
Description 
FREQ will produce class frequencies for the observations in DATA classified according to a specified 
set of classes. FREQ has options which include fitting of binomial, Poisson or normal curves to the 
data, producing a plot of observed and expected values, and producing labelled tabular output. 
Argument 
DATA - May be: 
1) A vector of observations, or 
2) A matrix with the rows being observations and the columns being variates. 
Note: The AND function may be used to produce this matrix.
Input 
The program asks for the left-hand end of the first frequency class, the class width, and the number of
classes.
Output
The tabular output includes:
1) The observed frequencies for each class, and these frequencies as a percentage of total number
of observations.
2) Expected frequencies and percentages of total for each class if a fit is performed.
3) The class end points and midpoints. 
4) A grand total observation count. 
5) A chi-square value for each variate. 
The plot output gives observed frequencies (and expected frequencies if a fit has been performed) 
plotted against the class midpoints (O denotes observed, while & denotes expected). The first and last 
rows of the tabular output represent the tails with their associated observed and expected frequencies. 


If a matrix is entered, each column will be treated as a separate variable but classified in the same 
way. Separate columns of output will represent separate variables.
Examples


DATA 
56 
57
FREQ DATA
ENTER THE FOLLOWING DATA.
LEFT HAND END OF THE FIRST FREQUENCY CLASS;YOUR DATA MIN=4
⎕:
0
CLASS WIDTH AND THE NUMBER OF CLASSES;YOUR DATA MAX=100
⎕:
10 10


DO YOU WISH A FIT DONE ON YOUR DATA? Y OR N

NORMAL OR POISSON ? N OR P

(1) DATA MEAN =55.08 AND STANDARD DEV. =30.48910143
(2) DATA MEAN =49.6 AND STANDARD DEV. =5.756983337

DO YOU WISH A HISTOGRAM ? Y OR N

ENTER THE COLUMN NOS. OF THE VARIATES TO BE PLOTTED
⎕:
1

DO YOU WISH TABULAR OUTPUT ? Y OR N


Y 
  -ENDPOINTS-                       1                           2
    L    R    MID     OBS   o/o     EXP    o/o     OBS   o/o     EXP
(LEFT TAIL)             0   0.0    1.759   3.5       0   0.0    0.000
    0   10    5.0       3   6.0    1.719   3.4       0   0.0    0.000
   10   20   15.0       5  10.0    2.780   5.6       0   0.0    0.000
   20   30   25.0       6  12.0    4.014   8.0       0   0.0    0.024
   30   40   35.0       3   6.0    5.238  10.5       0   0.0    2.351
   40   50   45.0       6  12.0    6.191  12.4      29  58.0   23.999
   50   60   55.0       4   8.0    6.495  13.0      20  40.0   21.866
   60   70   65.0       3   6.0    6.201  12.4       1   2.0    1.743
   70   80   75.0       5  10.0    5.256  10.5       0   0.0    0.016
   80   90   85.0       7  14.0    4.034   8.1       0   0.0    0.000
   90  100   95.0       7  14.0    2.799   5.6       0   0.0    0.000
(RIGHT TAIL)            1   2.0    3.514   7.0       0   0.0    0.000
TOTAL OBSERVATIONS     50                           50
CHI-SQUARE        19.3394                       3.9100


OBSERVED- o : EXPECTED- e : VARIABLE (COLUMN NO.)-1

[Histogram of the observed and expected frequencies for variable 1 against the class midpoints.]
I
10100111011001100001111101011 
11110110010000001101110100 
01110011111110001110100100 
11100100000011000110100001 
00100010111000001001001000 
0:110101111011011 


FREQ BINOMIAL I
ENTER THE TRIALS/EXPERIMENT AND NUMBER OF EXPERIMENTS
⎕:
4 50
ENTER THE FOLLOWING DATA.
LEFT HAND END OF THE FIRST FREQUENCY CLASS;YOUR DATA MIN=0
⎕:
0
CLASS WIDTH AND THE NUMBER OF CLASSES;YOUR DATA MAX=4
⎕:
1 5


DO YOU WISH A FIT DONE ON YOUR DATA? Y OR N
Y

DO YOU WISH A HISTOGRAM ? Y OR N

Y

DO YOU WISH TABULAR OUTPUT ? Y OR N

Y


-ENDPOINTS-
 L   R   MID    OBS   o/o     EXP    o/o
 0   1   0.5      6  12.0    3.125   6.3
 1   2   1.5     11  22.0   12.500  25.0
 2   3   2.5     13  26.0   18.750  37.5
 3   4   3.5     17  34.0   12.500  25.0
 4   5   4.5      3   6.0    3.125   6.3
TOTAL OBSERVATIONS   50
CHI-SQUARE       6.2133


OBSERVED- o : EXPECTED- e : VARIABLE (COLUMN NO.)-1

[Histogram of the observed and expected frequencies for variable 1 against the class midpoints.]
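
The expected frequencies and the chi-square value of the binomial example can be checked with a short Python sketch. It is not the FREQ source: the function name binomial_fit is hypothetical, and the sketch assumes that the success probability is estimated from the mean of the observed data.

```python
from math import comb

def binomial_fit(observed, trials):
    """Expected binomial frequencies and chi-square for counts of 0..trials successes."""
    n = sum(observed)                                   # number of experiments
    mean = sum(k * f for k, f in enumerate(observed)) / n
    p = mean / trials                                   # assumed estimate of P(success)
    expected = [n * comb(trials, k) * p**k * (1 - p)**(trials - k)
                for k in range(trials + 1)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2

# Observed frequencies from the example table (0, 1, 2, 3, 4 successes in 4 trials).
expected, chi2 = binomial_fit([6, 11, 13, 17, 3], 4)
```

Under that assumption the sketch gives expected frequencies 3.125 12.5 18.75 12.5 3.125 and a chi-square of about 6.2133, in agreement with the tabular output.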


Notes and Hints 
1) Observations fall within a class if and only if
Left Hand End Point ≤ Observation < Right Hand End Point.
2) To input binomial data use the following syntax: 
FREQ BINOMIAL DATA 


where DATA is to be entered one experiment at a time as a vector of raw binomial data. 
BINOMIAL classifies the data by experiment.
3) To obtain only the frequency analysis without fits, plots, or conversation, refer to the documentation
for FR in this workspace. If space is a prime consideration, use of FR may be helpful as
well.


4) When observed and expected frequencies are equal, the symbol 9 only will appear on the bar 
of the histogram. 


5) The argument DATA can be created using the syntax
DATA←INPUT
or
DATA←X AND Y AND Z
The function INPUT is in the workspace 31 ENTRY.
6) A DOMAIN ERROR will result if numbers which are too large are used in the Poisson fit. 


Assumptions 


Remember that the chi-square goodness of fit statistic is valid only when there are no classes with
an observed or expected frequency of less than 5.


Technical Notes
Normal approximation taken from Smillie. See Reference. 
Reference 


Smillie, K.W. Statpack 2: An APL Statistical Package, Publication No. 17, Department of Computing
Science, University of Alberta, 1969.


Source 
L. Gibson, S. Maxwell, S. Swaminathan 
Institute of Computer Science 


University of Guelph 
Guelph, Ontario
NEGBIN


Syntax


N←R NEGBIN F


Description 


For a given frequency distribution, NEGBIN outputs descriptive statistics. From these it estimates the
parameters P and K for a negative binomial distribution. It then calculates the resultant expected 
distribution which is tested for goodness of fit to the observed distribution by the chi-square test. If 
necessary to meet the requirements of the test, the frequency classes are conflated to give a total of 
N classes. N-3 degrees of freedom are used because P and K are estimated from the observations. 


Arguments 


R - Must be 0 if maximum likelihood estimators are to be calculated (warning: this calculation
is highly iterative). Otherwise R must be (⍳⍴F)-1 (i.e., a vector of midpoints of classes with
unit width), or a vector of lower class bounds, starting with ¯0.5.


F - Must be a vector of frequencies. If R is 0 or (⍳⍴F)-1, then F must give frequencies for
0,1,2,3,... events. If R consists of bounds, F must have a value for each class defined (i.e.,
⍴F must equal (⍴R)-1).

Result 

N - A vector of expected frequencies having a negative binomial distribution with parameters P
and K estimated from the distribution defined by the arguments. The frequencies, which total
those of the given distribution, are grouped within the classes defined by R.

Output 


In addition to returning the result, NEGBIN prints out these items fully labelled:


1) Mean, S.E., S.D. and number of values of F.

2) Either moments method estimates of P and K or the message
VARIANCE < MEAN. SO DISTRIBUTION IS NOT NEGATIVE BINOMIAL. (in which case the
function branches out and the result is an empty vector).

3) If R=0, either maximum likelihood estimates of P and K are displayed, or the message
MAXIMUM LIKELIHOOD ESTIMATES UNOBTAINABLE is printed.

4) Calculated chi-square, its degrees of freedom, and its probability, generally accurate to at least
three figures.
Example


0 1 2 3 4 NEGBIN 20 23 15 10 9


MEAN 1.545454545 
STANDARD ERROR 0.1508071863 
STANDARD DEVIATION 1.323327689 
NUMBER OF VALUES 77 
MOMENTS METHOD :P 0.8825136612
ESTIMATES :K 11.60887949
EXPECTED VALUES 

18.04585351 24.61246438 18.23009657 
9.715775097 4.168902504 

CHI-SQUARE VALUE 6.49636591 
DEGREES OF FREEDOM 2 
PROBABILITY 0.1650195675 
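
The moments-method figures in this example can be reproduced with the Python sketch below. It is an illustration rather than the NEGBIN source: the function name negbin_moments is hypothetical, the variance is assumed to use the N-1 divisor (which matches the printed standard deviation), and the estimators P = mean ÷ variance and K = mean × P ÷ (1-P) are inferred from the printed output.

```python
def negbin_moments(freq):
    """Moments estimates of P and K from frequencies of 0, 1, 2, ... events."""
    n = sum(freq)
    mean = sum(k * f for k, f in enumerate(freq)) / n
    var = sum(f * (k - mean) ** 2 for k, f in enumerate(freq)) / (n - 1)
    if var <= mean:
        return None                       # distribution is not negative binomial
    p = mean / var
    k = mean * p / (1 - p)
    expected, prob = [], p ** k           # prob starts at P(0 events)
    for x in range(len(freq)):
        expected.append(n * prob)
        prob *= (k + x) / (x + 1) * (1 - p)   # recurrence from P(x) to P(x+1)
    return p, k, expected

p, k, expected = negbin_moments([20, 23, 15, 10, 9])
```

The sketch returns P of about 0.88251 and K of about 11.6089, and its expected frequencies start at about 18.046, consistent with the printed values.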
Source 


Dr. G.H. McLaughlin 

Newhouse Communications Center 
Syracuse University 

Syracuse, N.Y.
NORMFIT
Syntax
EXP←C NORMFIT OB
Description 
For a given frequency distribution, NORMFIT outputs descriptive statistics. It then calculates the
resultant normal distribution, the mean and variance of which are estimated from the given distribution.
The observed distribution is then tested for goodness of fit to the normal distribution by the
chi-square test. If necessary to meet the requirements of the test, the frequency classes are conflated
to give a total of N classes. N-3 degrees of freedom are used because the two parameters of the normal
distribution are estimated from the observations.


Arguments 


C - Must be either a vector of lower class bounds, in increasing order, followed by the highest 
class bound, or a vector of class midpoints in increasing order. 

OB - Must be a vector of frequencies, one for each class defined by C. 

Result 


EXP - Is the vector of ⍴OB normally distributed frequencies at either the upper class bounds or the
class midpoints, whichever is defined by C.


Output 
In addition to returning the result, NORMFIT prints out these items, fully labelled:


mean, S.E., S.D. and number of values of OB, calculated chi-square, its degrees of freedom, and its 
probability, generally accurate to at least three figures. 


Example

EXP←¯2 ¯1 0 1 2 NORMFIT 5 10 20 10 5
MEAN 0 
STANDARD ERROR 0.1564921593 
STANDARD DEVIATION 1.10656667 
NUMBER OF VALUES 50 
EXPECTED VALUES 

1.767532948 7.386387193 15.84607986 

15.84607986 9.153920141 
CHI-SQUARE VALUE 11.96704202 
DEGREES OF FREEDOM 2 
PROBABILITY 0.01759803945 
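
The expected values printed above can be approximated with the Python sketch below. The scheme is inferred from the printed figures and is not taken from the NORMFIT source: the cumulative normal frequency is evaluated at each point of C, successive differences are taken, and the last class receives the remainder. The names normal_cdf and normfit_expected are hypothetical.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normfit_expected(points, observed):
    """Mean, S.D. (N-1 divisor) and expected normal frequencies for the given points."""
    n = sum(observed)
    mean = sum(c * f for c, f in zip(points, observed)) / n
    sd = sqrt(sum(f * (c - mean) ** 2 for c, f in zip(points, observed)) / (n - 1))
    cum = [n * normal_cdf((c - mean) / sd) for c in points]
    expected = [cum[0]]
    expected += [cum[i] - cum[i - 1] for i in range(1, len(points) - 1)]
    expected.append(n - cum[-2])          # last class takes what remains
    return mean, sd, expected

mean, sd, expected = normfit_expected([-2, -1, 0, 1, 2], [5, 10, 20, 10, 5])
```

For the example data the sketch gives a standard deviation of about 1.10657 and expected frequencies close to 1.77 7.39 15.85 15.85 9.15.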
Source 


Dr. G.H. McLaughlin 

Newhouse Communications Center 
Syracuse University 

Syracuse, N.Y.
NTILES
Syntax
R←NTILES DATA
Description 
NTILES will give N-tiles (median, quartiles, deciles, etc.) of a vector of frequencies. 
Argument 


DATA - The vector of frequencies required as input. (This is in the form of the result produced by 
the function FR in this workspace.) 


Input 
The program conversationally requests: 
1) The left-hand end point of the first frequency class and the class width which was used to produce 
the frequencies in DATA (exactly as would be required in FR to produce this particular input). 
2) The number of N-tiles: enter 2 for median, 4 for quartiles, 10 for deciles, etc. 
Result 
R - The explicit result produced by NTILES, which will be the median, quartiles, or deciles, etc.
Example 
DATA 
4 15 77 97 21 91 100 34 40 79 95 16 hà 30 43 
41 75 9 28 59 50 49 14 81 67 96 27 98 
11 70 63 57 88 99 19 22 74 85 53 68 80 
46 20 93 7 24 36 87 89 83 


R←8 8 11 FR DATA


T←NTILES R[;2]
ENTER THE LEFT HAND END POINT OF THE FIRST FREQUENCY CLASS 
AND THE CLASS WIDTH 


⎕:
0 8
ENTER 2 FOR MEDIAN ; 4 FOR QUARTILES ; ETC.
⎕:
2

T 


58.66666667


T←NTILES R[;2]
ENTER THE LEFT HAND END POINT OF THE FIRST FREQUENCY CLASS 
AND THE CLASS WIDTH 


⎕:
0 8
ENTER 2 FOR MEDIAN ; 4 FOR QUARTILES ; ETC.
⎕:
4

T 


31.2 58.66666667 87.2 
Restrictions 

NTILES will only handle vector input. 
Source 

L. Gibson, S. Maxwell, S. Swaminathan 
Institute of Computer Science 


University of Guelph 
Guelph, Ontario
31 MISC
FUNCTIONS


Function Header      Documentation    Description

R←M COMB N           COMBHOW          Generates all combinations of size N
                                      from the set of objects ⍳M.

R←PERMUTE V          PERMUTEHOW       Produces a permutation obtained from
                                      V, considering V as a K-digit number
                                      to the base K.

R←N ROUND DATA       ROUNDHOW         Rounds a single element, vector, or matrix
                                      of data to the left or right of the
                                      decimal point.

R←SMOOTH DATA        SMOOTHHOW        Smooths a vector by calculating a
                                      weighted moving average.


COMB
Syntax
R←M COMB N


Description 


COMB generates all the combinations of size N from the set of objects ⍳M.


Arguments 
M - The number of objects in the set.
N - The size of the combinations (should be strictly positive and less than or equal to M).
Result
R - A matrix with N!M rows and N columns, each row of which is a combination.
Example 
5 COMB 2 
1 2 
1 3 
1 4 
1 5 
2 3 
2 4
2 5 
3 4 
3 5 
4 5 


Notes and Hints 
1) A WS FULL error message may be generated if arguments that are too large are specified.
2) To get all combinations of size N from a set of objects OBJ, use
OBJ[(⍴OBJ) COMB N]
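
For readers more familiar with Python, the same result can be sketched with the standard library. This is an equivalent illustration, not the COMB source; the wrapper name comb_matrix is hypothetical.

```python
from itertools import combinations

def comb_matrix(m, n):
    """All combinations of size n from the objects 1..m, one combination per row."""
    return [list(c) for c in combinations(range(1, m + 1), n)]

rows = comb_matrix(5, 2)

# Counterpart of note 2: index an arbitrary set of objects with the combinations.
objects = ["A", "B", "C"]
pairs = [[objects[i - 1] for i in row] for row in comb_matrix(len(objects), 2)]
```

Here rows holds the ten rows 1 2 through 4 5 in the same order as the example, and pairs holds the combinations of the objects themselves.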
Source
K. Hui


I.P. Sharp Associates Limited
Calgary, Alberta
PERMUTE

Syntax

R←PERMUTE V

Description 

Given a permutation of integers, PERMUTE calculates a different permutation. 

Arguments 

V - A vector containing a permutation of the integers 0,1,2,...,K-1 where K>2.

Result 

R - The permutation obtained from V by considering V as a K-digit number to the base K, and
adding (K-1) base K successively until the sum contains each of the digits 0,1,2,...,K-1.
Execution of PERMUTE K-factorial times, starting with V as any permutation, gives all
K-factorial permutations of 0,1,2,...,K-1.


Example 


PERMUTE 0 1 2
0 2 1

PERMUTE 0 2 1
1 0 2

PERMUTE 1 0 2
1 2 0

PERMUTE 1 2 0
2 0 1

PERMUTE 2 0 1
2 1 0

PERMUTE 2 1 0
0 1 2
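
The procedure described under Result can be sketched in Python as follows. This is an illustration, not the PERMUTE source; the function name permute is hypothetical, and the wrap-around modulo K to the power K (needed to get from 2 1 0 back to 0 1 2) is an assumption.

```python
def permute(v):
    """Next permutation of 0..K-1, found by repeatedly adding K-1 in base K."""
    k = len(v)
    modulus = k ** k                      # assumed wrap-around for K-digit numbers
    n = 0
    for digit in v:                       # read v as a base-K number
        n = n * k + digit
    target = list(range(k))
    while True:
        n = (n + k - 1) % modulus
        digits, m = [], n
        for _ in range(k):                # expand n back into K base-K digits
            digits.append(m % k)
            m //= k
        digits.reverse()
        if sorted(digits) == target:      # each of the digits 0..K-1 appears once
            return digits

cycle = [[0, 1, 2]]
for _ in range(6):
    cycle.append(permute(cycle[-1]))
```

Starting from 0 1 2, six applications visit all six permutations and return to the starting point, in the order shown in the example.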
Reference 


Smillie, K.W. Statpack 2: An APL Statistical Package, Publication No. 17, Department of Computing
Science, University of Alberta, 1969.


Source 
Dr. K.W. Smillie 
Department of Computing Science 


University of Alberta 
Edmonton, Alberta
ROUND
Syntax
R←N ROUND DATA
Description 
This program rounds an array of numbers to the left or right of the decimal point. 
Arguments 
DATA - The array of numbers to be rounded (vector, scalar, or matrix). 
N - Positive N requests DATA to be rounded N places to the right of the decimal point.
Negative N requests DATA to be rounded N places to the left of the decimal point. (Remember
to use the negative sign above 2 on the keyboard.)


Result
R - The array of numbers rounded to N decimal places.
Examples 
DATA 
414.285714 505.714286 137.142857 520 
431.428571 157.142857 205.714286 35.7142857 
372.857143 250 95.7142857 517.142857 
42.8571429 438.571429 278.571429 288.571429 
361.428571 274.285714 514.285714 295.714286
⎕←R←4 ROUND DATA
414.2857 505.7143 137.1429 520 
431.4286 157.1429 205.7143 35.7143 
372.8571 250 95.7143 517.1429 
42.8571 438.5714 278.5714 288.5714 
361.4286 274.2857 514.2857 295.7143
⎕←R←¯1 ROUND DATA
410 510 140 520
430 160 210 40 
370 250 100 520 
40 440 280 290 
360 270 510 300 
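
A minimal Python sketch of the rounding rule follows. It is not the ROUND source; the function name round_places is hypothetical, and rounding halves upward is an assumption that the example does not test.

```python
import math

def round_places(x, n):
    """Round x to n places right (n >= 0) or left (n < 0) of the decimal point."""
    scale = 10 ** abs(n)
    if n >= 0:
        return math.floor(x * scale + 0.5) / scale
    return math.floor(x / scale + 0.5) * scale

first_row_right = [round_places(x, 4) for x in (414.285714, 505.714286, 137.142857, 520)]
first_row_left = [round_places(x, -1) for x in (414.285714, 505.714286, 137.142857, 520)]
```

The two lists correspond to the first rows of the two results in the example (N equal to 4 and to ¯1).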
Source 


L. Gibson, S. Maxwell, S. Swaminathan 
Institute of Computer Science 
University of Guelph

Guelph, Ontario
SMOOTH
Syntax
R←SMOOTH DATA
Description 
SMOOTH smooths a vector of data by calculating weighted moving averages. 
Arguments 
DATA - The vector of data to be smoothed. 
Input 


The program requests a set of weights to be used in smoothing (remember, the number of weights 
determines the number of terms summed for each new term in the result). 


Result 

R - A vector of length 1+(⍴DATA)-(⍴WEIGHTS). If the sum of the weights is not equal to 1,
each weight will be divided by this sum.

Example 
DATA 


205 4 151 218 270 16 273 147 180 211 25 136 155 
17 272 68 66 49 247 277


R←SMOOTH DATA

ENTER THE WEIGHTS

⎕:
.3 .2 .5
R

137.8 140.4 223.9 127.4 220.7 132.9 201.3 185.6
108.7 136.3 112.2 80.3 185.9 93.5 128.2 58.1
153.1 202.6

Restrictions 

Only vector input is allowed. 

Methodology 


SMOOTH will calculate a moving average in which each figure is replaced by the mean of itself and 
values corresponding to a number of preceding and succeeding periods with the specified weights. 
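
The weighted moving average can be sketched in Python as below. This is an illustration and not the SMOOTH source; the function name smooth is hypothetical. The weights are divided by their sum, and each window of consecutive values yields one term of the result.

```python
def smooth(data, weights):
    """Weighted moving average; returns 1 + len(data) - len(weights) terms."""
    total = sum(weights)
    w = [x / total for x in weights]      # normalise so that the weights sum to 1
    m = len(w)
    return [sum(wi * data[i + j] for j, wi in enumerate(w))
            for i in range(len(data) - m + 1)]

DATA = [205, 4, 151, 218, 270, 16, 273, 147, 180, 211, 25, 136, 155,
        17, 272, 68, 66, 49, 247, 277]
R = smooth(DATA, [0.3, 0.2, 0.5])
```

With the example data and the weights .3 .2 .5, the first term is 0.3×205 + 0.2×4 + 0.5×151 = 137.8, as in the example, and the result has 18 terms.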


Reference 


Freund, J.E. Modern Elementary Statistics, Second Edition, Prentice-Hall, 1964, Chapter 17.4.
Source


L. Gibson, S. Maxwell, S. Swaminathan 
Institute of Computer Science 

University of Guelph 

Guelph, Ontario
31 NONPARAMETRIC
FUNCTIONS


Function Header       Documentation      Description

R←BINOMTEST X         BINOMTESTHOW       Performs the binomial test.

T←CATANOVA N          CATANOVAHOW        Performs an analysis of variance on
                                         categorical data.

CONCORDANCE           CONCORDANCEHOW     Computes the Kendall coefficient of
                                         concordance.

R←COXSTUART X         COXSTUARTHOW       Performs the Cox-Stuart test for trend.

R←FRIEDMAN X          FRIEDMANHOW        Performs the Friedman 2-way analysis
                                         of variance.

KENDALL               KENDALLHOW         Computes the Kendall-tau rank order
                                         correlation coefficient.

R←N KRUSKAL X         KRUSKALHOW         Performs the Kruskal-Wallis 1-way
                                         analysis of variance.

R←N MANN W            MANNHOW            Computes the Mann-Whitney U statistic
                                         and its one-tailed probability.

R←X MCNEMAR Y         MCNEMARHOW         Performs the McNemar test for
                                         changes.

R←N MEDIANTEST X      MEDIANTESTHOW      Performs the median test on several
                                         groups.

R←X MEDIAN2TEST Y     MEDIAN2TESTHOW     Performs the median test on 2 groups.

R←C MOSES E           MOSESHOW           Performs the Moses test of extreme
                                         reactions.

R←NDTRI P             NDTRIHOW           Computes the percentile and ordinate
                                         of the normal distribution.

R←PCONFINT X          PCONFINTHOW        Computes confidence limits for the P of
                                         a binomial distribution.

PTBISERIAL            PTBISERIALHOW      Computes the point-biserial correlation
                                         coefficient and the F-value for a test of
                                         significance.

R←QTEST X             QTESTHOW           Performs Cochran's Q-test.

Y←RANDOMTEST X        RANDOMTESTHOW      Performs the test for runs above and
                                         below the median.

R←RANK X              RANKHOW            Computes the ranks of the observations
                                         in a vector or matrix.

R←X RHO Y             RHOHOW             Computes the Spearman rank correlation
                                         coefficient corrected for ties.

R←RUNS X              RUNSHOW            Computes the number of runs in a
                                         dichotomized string.

Y←RUNSTEST X          RUNSTESTHOW        Performs the runs test for randomness.

R←X SIGNTEST Y        SIGNTESTHOW        Performs the sign test.

R←X UTEST Y           UTESTHOW           Performs the Mann-Whitney U-test.

R←X WALD Y            WALDHOW            Performs the Wald-Wolfowitz runs
                                         test.


BINOMTEST 
Syntax
R←BINOMTEST X
Description 


BINOMTEST computes the tails of the binomial distribution. 


Argument 

X - A vector of length 3, containing N, X and P, where N is the total number of trials, P is the
probability of success on each trial, and X is the specification for the tails as discussed below.

Result 

R - A vector of length two, containing the tails of the B(N,P) distribution from 0 to X and from
N-X to N. 

Example 


N=15, X=5, P=.3
R←BINOMTEST 15 5 .3
R 
0.7216214402 0.003652521008 
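
The two tails can be checked with a few lines of Python. This is an illustration of the definition given under Result, not the BINOMTEST source; the function name binom_tails is hypothetical.

```python
from math import comb

def binom_tails(n, x, p):
    """Lower tail P(0..x) and upper tail P(n-x..n) of the B(n, p) distribution."""
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    return sum(pmf[:x + 1]), sum(pmf[n - x:])

lower, upper = binom_tails(15, 5, 0.3)
```

For N=15, X=5, P=.3 the sketch gives about 0.72162 for the tail from 0 to 5 and about 0.0036525 for the tail from 10 to 15, in agreement with R above.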
Notes and Hints 


1) For other functions related to the binomial probability distribution, see BINOM and 
BINOMIALPROB in the workspace 33 PROBDIST. 


2) A normal approximation can be used for this test, when P is close to .5 and N≥25. See the
workspace 33 PROBDIST for a variety of functions dealing with the normal distribution. 


References 
Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 4. 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel, 
1972. 


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University 
Ramat-Gan, Israel
CATANOVA
Syntax
T←CATANOVA N
Description 
CATANOVA performs an analysis of variance on categorical data. 
Argument 


N - A matrix of frequencies where N[I;J] is the number of respondents in the j-th category for
the i-th group.


Result
T - A 3x3 matrix containing the analysis of variance data as given by Light and Margolin in
JASA, 1971. The first column contains the sums of squares: BSS, WSS, TSS. The second
column contains BSS÷TSS and the degrees of freedom. The third contains the Light-Margolin
statistic and the usual chi-square statistic.
Example 
Taken from Light and Margolin, page 540. See References. 
DATA 
62 121 26 33 84 
61 159 41 20 20


⎕←R←CATANOVA DATA


4.667376514 0.02097842481 51.69083873 
217.8172264 4 47.0091926 
222.4846029 0 0 


Notes and Hints 
For other analysis of variance programs, see the workspaces COVARIANCE and ANOVA in library 34. 
References 


Light, R.J. and B.H. Margolin. “An Analysis of Variance for Categorical Data”, Journal of the 
American Statistical Association, Vol. 66, No. 335, Sept. 1971, pp. 534-544. 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel,
1972.


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University 
Ramat-Gan, Israel 




CONCORDANCE 
Syntax 
CONCORDANCE 
Description 


CONCORDANCE computes Kendall's coefficient of concordance, W, a measure of the degree of agreement
or concordance among a series of rankings of a set of cases. 


Input


The user is asked whether a correction for ties is to be performed. The matrix of ranks is then
requested. 


Output 


The printed output consists of the coefficient of concordance, and, if the number of cases is more than 
7, the chi-square value and degrees of freedom for a significance test.


Example 


Taken from Siegel, page 230. See Reference. 


DATA 
1 6 3 2 5 4
1 5 6 4 2 3
6 3 2 5 4 1
CONCORDANCE
DO YOU WISH TO CORRECT FOR TIES?
NO
ENTER MATRIX OF RANKS.
⎕:
DATA 
KENDALL COEFFICIENT OF CONCORDANCE, W..... 0.1619047619 


Notes and Hints 


1) Correcting for ties requires substantially more execution time, so it is important to correct for 
ties only when necessary. 


2) The subfunctions W and W1 with the syntax Z←W X or Z←W1 X can be used instead of
CONCORDANCE. (W - not correcting for ties; W1 - correcting for ties.)
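
The uncorrected coefficient can be sketched in Python as follows. This is the textbook formula without the correction for ties, not the source of W or CONCORDANCE; the function name kendall_w is hypothetical, and the three rankings used are the values assumed for Siegel's example (three judges, six cases), chosen because they reproduce the printed coefficient.

```python
def kendall_w(ranks):
    """Kendall's W for k rankings (rows) of n cases (columns), without tie correction."""
    k, n = len(ranks), len(ranks[0])
    sums = [sum(col) for col in zip(*ranks)]          # rank total for each case
    mean = sum(sums) / n
    s = sum((r - mean) ** 2 for r in sums)
    w = 12 * s / (k ** 2 * (n ** 3 - n))
    chi_square = k * (n - 1) * w                      # used when there are more than 7 cases
    return w, chi_square

w, chi_square = kendall_w([[1, 6, 3, 2, 5, 4],
                           [1, 5, 6, 4, 2, 3],
                           [6, 3, 2, 5, 4, 1]])
```

For these rankings the sum of squared deviations of the rank totals is 25.5 and W is 25.5 ÷ 157.5, about 0.1619, as printed in the example.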


Reference 


Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 9.
COXSTUART
Syntax
R←COXSTUART X
Description 
COXSTUART performs the Cox-Stuart test for trend. 
Argument 
X  - The vector of observations. 
Result 
R - A 5-element vector containing:
R[1] - The number of non-zero differences between the two segments of X
R[2] - The smaller number of like-signed differences
R[3] - The probability figure .5
R[4] and R[5] - The two-tailed probabilities if R[1]<25
R[4] - The Z statistic if R[1]>25.
Output 
The program prints an annotated table of R. 
Example 
Taken from Conover, page 132. See References.
X
18.6 12.2 104 220 110 86 92.8 74.4 75.4 51.7 
29.3 16 14.2 10.5 123 190 138 98.1 88.1 
80 75.6 48.8 27.1 15.7
R←COXSTUART X
WE-TAILED P-VALUE-0.3872070313 


R 
12 5 0.5 0.3872070313 0.3872070313 
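
The first elements of R can be reproduced with the Python sketch below. It is an illustration of the usual Cox-Stuart procedure, not the COXSTUART source: the function name cox_stuart is hypothetical, and the way the series is halved (dropping the middle value of an odd-length series) is an assumption. Each value in the first half is paired with the corresponding value in the second half, and the signs of the differences are counted.

```python
from math import comb

def cox_stuart(x):
    """Non-zero differences, smaller like-signed count, and its binomial tail at P = .5."""
    c = len(x) // 2
    first, second = x[:c], x[len(x) - c:]     # middle value dropped when len(x) is odd
    plus = sum(1 for a, b in zip(first, second) if b > a)
    minus = sum(1 for a, b in zip(first, second) if b < a)
    n, t = plus + minus, min(plus, minus)
    tail = sum(comb(n, k) for k in range(t + 1)) / 2 ** n
    return n, t, tail

X = [18.6, 12.2, 104, 220, 110, 86, 92.8, 74.4, 75.4, 51.7, 29.3, 16,
     14.2, 10.5, 123, 190, 138, 98.1, 88.1, 80, 75.6, 48.8, 27.1, 15.7]
n, t, tail = cox_stuart(X)
```

For the example data there are 12 non-zero differences, 5 of them positive, and the binomial tail is 1586 ÷ 4096, about 0.3872.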




References 
Conover, W.J. Practical Nonparametric Statistics, John Wiley & Sons Inc., 1971, Chapter 3. 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel, 
1972. 


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University 
Ramat-Gan, Israel
FRIEDMAN
Syntax
R←FRIEDMAN X
Description 
FRIEDMAN performs the Friedman 2-way analysis of variance by ranks. 
Argument 
X - A matrix of N (groups) rows and K (treatments) columns.
Result 
R - The Friedman (chi-square) test statistic. 
Output 
The function prints N, K, and the chi-square statistic. 
Example 


Taken from Conover, page 167. See References. 


Xe12 494 321423131424312442133124,0 


132424133124413242313124 


овкоювно косо в + 
BNEREORNEENY 
ооо кә кє кә кэ В КО кэ со N X 
FRNFUF FOF ERE 


R←FRIEDMAN X
N= 12
K= 4
CHI-SQUARE-B.9 


R 


Notes and Hints 


Since ranking is performed by the program, the user need simply enter the raw scores in the proper 
format. 
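
The ranking and the test statistic can be sketched in Python as below. This uses the textbook Friedman formula with average ranks for ties and no further tie correction; whether FRIEDMAN treats ties the same way is not stated here, so that part is an assumption, as are the function names.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    first, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
        count[v] = count.get(v, 0) + 1
    return [first[v] + (count[v] - 1) / 2 for v in values]

def friedman(x):
    """Friedman chi-square for a matrix of N rows (groups) and K columns (treatments)."""
    n, k = len(x), len(x[0])
    col_sums = [0.0] * k
    for row in x:                              # rank the raw scores within each row
        for j, r in enumerate(average_ranks(row)):
            col_sums[j] += r
    return 12 * sum(s * s for s in col_sums) / (n * k * (k + 1)) - 3 * n * (k + 1)

# Hypothetical raw scores: three groups that order the treatments identically.
stat = friedman([[10, 20, 30], [11, 25, 40], [5, 6, 7]])
```

In this small hypothetical case the column rank totals are 3, 6 and 9, which gives a statistic of 6.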


References 
Conover, W.J. Practical Nonparametric Statistics, John Wiley & Sons Inc., 1971, Chapter 5. 
Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 7. 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel, 
1972. 


Source 
Dr. Mitchell Snyder 
Computer Centre


Bar-Ilan University
Ramat-Gan, Israel
KENDALL
Syntax
KENDALL
Description 


The functions KENDALL, KENDALLTAU, TAU, TAU1, and PARTKENDALL are all concerned with the
Kendall tau rank-order correlation coefficient.


Input

The first input requested by the functions KENDALL and PARTKENDALL is an indication as to whether
a correction for ties is required. The user is then asked to enter the data in the form of an Nx2 (for
KENDALL and KENDALLTAU) or Nx3 (for PARTKENDALL) matrix, with the columns representing the
different rankings and the rows the cases.


For KENDALL and PARTKENDALL the data should be in the form of ranks, while for KENDALLTAU
the actual observations are expected.


Output 

The output of KENDALL and KENDALLTAU consists of the correlation coefficient and, if the sample size
is 10 or more, the standard deviation and Z-value for a test of significance. The output of
PARTKENDALL consists only of the partial correlation coefficient.

Examples 

(See next page.) 

Notes and Hints 

1) KENDALL, KENDALLTAU, and PARTKENDALL are all called by simply typing the function name;
TAU and TAU1 each have a single argument: an Nx2 matrix giving the rank scores of N items
by each of two judges. The scalar value returned is the tau coefficient.


2) TAU1 corrects for ties; TAU does not. Since correcting for ties requires substantially more execution
time, it is advisable to use TAU1 only when necessary.


3) For PARTKENDALL, the partial correlation produced is that of the first two columns, with the
third partialled out. PARTKENDALL works by computing the three tau coefficients, then using them
to compute the partial.
Example - KENDALL
Taken from Siegel, page 217. See Reference. 


DATA
 3  2
 4  6
 2  5
 1  1
 8 10
11  9
10  8
 6  3
 7  4
12 12
 5  7
 9 11

KENDALL
DO YOU WISH TO CORRECT FOR TIES?


ENTER MATRIX OF RANKS.
⎕:

DATA
KENDALL TAU RANK-ORDER CORRELATION..... 0.6666666667
STANDARD DEVIATION..... 0.2209559884
Z-VALUE FOR SIGNIFICANCE LEVEL..... 3.017192118


KENDALLTAU
ENTER THE Nx2 MATRIX OF DATA VALUES
⎕:

DATA
KENDALL TAU..... 0.6666666667
STANDARD DEVIATION..... 0.2209559884
Z-VALUE FOR SIGNIFICANCE LEVEL..... 3.017192118
Example - PARTKENDALL


Taken from Siegel, page 227. See Reference. 


DATA 
3 2 1.5
4 6 1.5
2 5 3.5
1 1 3.5
8 10 5 
11 9 6 
10 8 7 
6 3 8 
7 4 9 
12 12 10.5 
5 7 10.5 
9 11 12 
PARTKENDALL 
DO YOU WISH TO CORRECT FOR TIES? 
YES 
ENTER MATRIX OF RANKS. 
⎕:
DATA 
KENDALL PARTIAL CORRELATION COEFFICIENT..... 0.6135709 
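
The figures printed in the KENDALL example can be reproduced with the Python sketch below. It implements the uncorrected tau (the counterpart of TAU, not TAU1) and the usual large-sample standard deviation; it is not the library source, and the function name kendall_tau is hypothetical. The pairs are the first two columns of the rank matrix in the PARTKENDALL example, with the fourth row taken as 1 1.

```python
from math import sqrt

def kendall_tau(pairs):
    """Kendall tau without tie correction, its standard deviation, and the Z-value."""
    n, s = len(pairs), 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            s += (prod > 0) - (prod < 0)       # +1 if concordant, -1 if discordant
    tau = s / (n * (n - 1) / 2)
    sd = sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))
    return tau, sd, tau / sd

PAIRS = [(3, 2), (4, 6), (2, 5), (1, 1), (8, 10), (11, 9),
         (10, 8), (6, 3), (7, 4), (12, 12), (5, 7), (9, 11)]
tau, sd, z = kendall_tau(PAIRS)
```

For these twelve pairs S is 44, so tau is 44 ÷ 66, about 0.6667, with a standard deviation of about 0.22096 and a Z-value of about 3.0172.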
Reference 


Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 9.
KRUSKAL
Syntax
R←N KRUSKAL X
Description 


KRUSKAL performs the Kruskal-Wallis 1-way analysis of variance.


Arguments 

N - The vector whose length is the number of groups and whose elements are the group sizes. 
X - A long vector containing all of the observations, one after another. 

Result 

R  - Contains the degrees of freedom and the Kruskal-Wallis statistic. 

Example 


Taken from Conover, page 258. See References.


X1

83 91 94 89 89 96 91 92 90
X2

91 90 81 83 84 83 88 91 89 84
X3

101 100 91 93 96 95 94
X4


78 82 81 77 79 81 80 81

R←9 10 7 8 KRUSKAL X1,X2,X3,X4
3 mm 
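
The calling convention (a vector of group sizes on the left, one long vector of observations on the right) and the uncorrected statistic can be sketched in Python as follows. This is the textbook Kruskal-Wallis formula with average ranks and no tie correction, not the KRUSKAL source; the function names are hypothetical and the small data set is invented for illustration.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    first, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
        count[v] = count.get(v, 0) + 1
    return [first[v] + (count[v] - 1) / 2 for v in values]

def kruskal(sizes, x):
    """Degrees of freedom and Kruskal-Wallis statistic (no correction for ties)."""
    ranks = average_ranks(x)
    n, start, total = len(x), 0, 0.0
    for size in sizes:                         # sum of squared rank totals over group size
        r = sum(ranks[start:start + size])
        total += r * r / size
        start += size
    h = 12 * total / (n * (n + 1)) - 3 * (n + 1)
    return len(sizes) - 1, h

df, h = kruskal([3, 3], [1, 2, 3, 4, 5, 6])
```

For the two invented groups the rank totals are 6 and 15, which gives 1 degree of freedom and a statistic of about 3.857.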
Notes and Hints 


The program does not correct for ties.
References
Conover, W.J. Practical Nonparametric Statistics, John Wiley & Sons Inc., 1971, Chapter 5. 
Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 8. 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel, 
1972. 


Source 
Dr. Mitchell Snyder 
Computer Centre


Bar-Ilan University
Ramat-Gan, Israel
MANN
Syntax
R←N MANN W
Description 


MANN computes the Mann-Whitney U statistic and its one-tailed probability. The Z statistic from 
which the probability is calculated by a normal approximation is corrected for ties. 


Arguments 


N - A vector of raw scores. 
W - A vector of raw scores, not necessarily the same length as N.


Result 

R - A vector giving the joint rank ordering of N and W.
Output

In addition to the result, MANN outputs the values labelled:
U:
Z:
PROBABILITY:


Example 


X
13 12 12 10 10 10 10 9 8 8 7 7 7 7 7 6

Y
17 16 15 15 15 14 14 14 13 13 13 12 12 12 12
11 11 10 10 10 8 8 6


R←X MANN Y
U: 304
Z: 3.45095466
PROBABILITY: 0.0002793868927
R
29.5 24.5 24.5 16 16 16 16 12 9.5 9.5 5 5 5 5 5 1.5 39 38 36
36 36 33 33 33 29.5 29.5 29.5 24.5 24.5 24.5 24.5 20.5
20.5 16 16 16 9.5 9.5 1.5
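
The joint ranking, the U statistic and the tie-corrected Z can be sketched in Python as below. The formulas are the standard ones from Siegel's chapter; this is not the MANN source, the function names are hypothetical, and the exact normal tail is used where the library may use an approximation.

```python
from math import sqrt, erfc

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    first, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
        count[v] = count.get(v, 0) + 1
    return [first[v] + (count[v] - 1) / 2 for v in values]

def mann(a, b):
    """Joint ranks, U, tie-corrected Z and one-tailed probability for two samples."""
    joint = list(a) + list(b)
    ranks = average_ranks(joint)
    n1, n2, total = len(a), len(b), len(joint)
    u = n1 * n2 + n1 * (n1 + 1) / 2 - sum(ranks[:n1])
    ties = {}
    for v in joint:
        ties[v] = ties.get(v, 0) + 1
    tie_term = sum((t ** 3 - t) / 12 for t in ties.values())
    var = n1 * n2 / (total * (total - 1)) * ((total ** 3 - total) / 12 - tie_term)
    z = (u - n1 * n2 / 2) / sqrt(var)
    return ranks, u, z, 0.5 * erfc(abs(z) / sqrt(2))

X = [13, 12, 12, 10, 10, 10, 10, 9, 8, 8, 7, 7, 7, 7, 7, 6]
Y = [17, 16, 15, 15, 15, 14, 14, 14, 13, 13, 13, 12, 12, 12, 12,
     11, 11, 10, 10, 10, 8, 8, 6]
ranks, u, z, prob = mann(X, Y)
```

For the example data the ranks of X sum to 200, so U = 16×23 + 136 - 200 = 304, and the tie-corrected Z is about 3.451.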
Notes and Hints 
1) If either argument has 20 or fewer values the significance of U is better determined from tables.


2) For a similar function, which does not correct for ties, see UTEST in this workspace.
Reference

Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 6. 
Source 

Dr. G.H. McLaughlin 

Newhouse Communications Center 

Syracuse University 


Syracuse, N.Y. 


(Modified by F. Arthur, I.P. Sharp Associates Limited)
MCNEMAR
Syntax
R←X MCNEMAR Y
Description 


MCNEMAR performs the McNemar test for significant changes. 


Arguments 

X - A vector representing the initial set of observations. 

Y - A vector of equal length to X containing after-treatment observations.
Result

R - The chi-square value ((A-D)*2)÷A+D.

Outputs


A - The number of times X[I] and Y[I] are different and X[I]=X[1].
D - The number of times X[I] and Y[I] are different and X[I]≠X[1].


Example 


012112002100012300100410 


Y 
1120012110042 2024323221001 100 


ReX MCNEMAR Y 
A=7
D=5
CHI-SQUARE=0.3333333333


R 
0.3333333333 


Notes and Hints 
1) The sets X and Y must consist of sequences of only two different values.


2) The correction for continuity and the correction for small expected values are not included.
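
As an illustration of the arithmetic, the following Python sketch counts the two kinds of change and forms the uncorrected chi-square value. The function name mcnemar and the before/after data are hypothetical; the data were chosen only so that A=7 and D=5.

```python
def mcnemar(x, y):
    """Return (A, D, chi-square) for two equal-length two-valued sequences.

    A counts changes that start from the value x[0]; D counts changes that
    start from the other value. No continuity correction is applied.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    first = x[0]
    a = sum(1 for xi, yi in zip(x, y) if xi != yi and xi == first)
    d = sum(1 for xi, yi in zip(x, y) if xi != yi and xi != first)
    if a + d == 0:
        raise ValueError("no changes between x and y")
    return a, d, (a - d) ** 2 / (a + d)


# Hypothetical data: 7 changes 0->1, 5 changes 1->0, 8 unchanged cases.
before = [0] * 7 + [1] * 5 + [0] * 4 + [1] * 4
after = [1] * 7 + [0] * 5 + [0] * 4 + [1] * 4
a, d, chi_square = mcnemar(before, after)
print(a, d, chi_square)
```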
References
Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 5.


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel,
1972.


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University 
Ramat-Gan, Israel 
MEDIANTEST
Syntax
R←N MEDIANTEST X
Description 
MEDIANTEST performs a median test on several groups. 
Arguments 


N - A vector of all the group sizes. 
X - A vector containing all observations, one group after another. 


Result 
R - The degrees of freedom and chi-square value. 
Example 


Taken from Conover, page 169. See References. 


X1 

83 91 94 89 89 96 91 92 90 
X2 

91 90 81 83 84 83 88 91 89 84 
X3

101 100 91 93 96 95 94
X4


78 82 81 77 79 81 80 81

R←9 10 7 8 MEDIANTEST X1,X2,X3,X4
3 Қар Жетсе” 
Notes and Hints 


C is a global variable which is a matrix of the number of observations in each group above and below 
the common median. The number of rows in C equals the number of groups. It has 2 columns. 
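
The following Python sketch shows one way of forming the table of counts and the chi-square value for the four groups of the example. It is an illustration under stated assumptions, not the library's code: the name median_test is hypothetical, and observations equal to the common median are counted as "not above", which may differ from the rule MEDIANTEST applies.

```python
from statistics import median


def median_test(sizes, observations):
    """Return (degrees of freedom, chi-square, table of (above, not above) counts)."""
    if sum(sizes) != len(observations):
        raise ValueError("group sizes do not match the number of observations")
    grand_median = median(observations)
    table, start = [], 0
    for n in sizes:
        group = observations[start:start + n]
        above = sum(1 for v in group if v > grand_median)
        table.append((above, n - above))
        start += n
    total = len(observations)
    total_above = sum(row[0] for row in table)
    total_below = total - total_above
    chi_square = 0.0
    for (above, not_above), n in zip(table, sizes):
        expected_above = n * total_above / total
        expected_below = n * total_below / total
        chi_square += (above - expected_above) ** 2 / expected_above
        chi_square += (not_above - expected_below) ** 2 / expected_below
    return len(sizes) - 1, chi_square, table


x1 = [83, 91, 94, 89, 89, 96, 91, 92, 90]
x2 = [91, 90, 81, 83, 84, 83, 88, 91, 89, 84]
x3 = [101, 100, 91, 93, 96, 95, 94]
x4 = [78, 82, 81, 77, 79, 81, 80, 81]
df, chi_square, table = median_test([9, 10, 7, 8], x1 + x2 + x3 + x4)
print(df, chi_square, table)
```

Under these assumptions the table plays the role of the global variable C described above.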
References
Conover, W.J. Practical Nonparametric Statistics, John Wiley & Sons Inc., 1971, Chapter 4.
Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 6.


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel,
1972.


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University
Ramat-Gan, Israel 
MEDIAN2TEST
Syntax
R←X MEDIAN2TEST Y
Description 


MEDIAN2TEST performs a median test on two groups. 


Arguments 

X - A vector of numbers, representing one group. 

Y - A vector, not necessarily of equal length to X, representing another group. 

Result 

R - The chi-square value of the 2x2 matrix D, given by X and Y above and below the common 
median. 

Example 
X

13 7 12 8 7 7 10 10 6 12 8 10 9 7 7 10 7
Y


10 14 13 13 8 17 12 15 8 11 16 14 12 6 12
15 11 15 13 12 10 10 14
R←X MEDIAN2TEST Y


R 
12.37851662 

D 

3 17 

14 6 


Notes and Hints 

In calculating the chi-square value, no correction for continuity is made. 
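
The 2x2 calculation can be sketched as follows in Python. The name median2_test is hypothetical, and counting values equal to the common median as "not above" is an assumption (it has no effect on the example, where the common median is 10.5).

```python
from statistics import median


def median2_test(x, y):
    """Return (chi-square, D); D holds the counts above and not above the common median."""
    grand_median = median(list(x) + list(y))
    above = [sum(1 for v in group if v > grand_median) for group in (x, y)]
    below = [len(x) - above[0], len(y) - above[1]]
    a, b = above
    c, d = below
    n = a + b + c + d
    # Chi-square for a 2x2 table, without correction for continuity.
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi_square, [above, below]


x = [13, 7, 12, 8, 7, 7, 10, 10, 6, 12, 8, 10, 9, 7, 7, 10, 7]
y = [10, 14, 13, 13, 8, 17, 12, 15, 8, 11, 16, 14, 12, 6, 12,
     15, 11, 15, 13, 12, 10, 10, 14]
chi_square, d_matrix = median2_test(x, y)
print(chi_square, d_matrix)
```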

References 

Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 6. 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel,
1972.


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University
Ramat-Gan, Israel 
MOSES
Syntax
R←C MOSES E
Description 
MOSES performs the Moses test of extreme reactions. 


Arguments 


C - The vector of observations from the control group. 
E - The vector of observations from the experimental group. 


Result 
R - A vector of 5 elements: 


R[1] - The probability

R[2] - Number of control cases

R[3] - Number of experimental cases

R[4] - The amount by which the span exceeds R[2]
R[5] - The span


Example 
C

12 18 6 13 13 3 10 10 11
E

25 5 14 19 0 17 15 8 8
R←C MOSES E
R 


0.1470588235 9 9 5 14 

Notes and Hints 

H is always taken as 0. 
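
The probability in R[1] can be checked from the other elements of R with the formula given in the Siegel reference. The Python sketch below is an illustration only; the name moses_probability and its argument names are hypothetical, and the span itself is taken as given rather than recomputed from the data.

```python
from math import comb


def moses_probability(n_control, n_experimental, excess, h=0):
    """Probability that the span exceeds its minimum by `excess` or less.

    The minimum span is n_control - 2*h; h is the number of extreme control
    cases trimmed from each end (0 here, as in the note above).
    """
    last = min(excess, n_experimental)
    favourable = sum(
        comb(i + n_control - 2 * h - 2, i)
        * comb(n_experimental + 2 * h + 1 - i, n_experimental - i)
        for i in range(last + 1)
    )
    return favourable / comb(n_control + n_experimental, n_control)


# Values taken from the result of the example: R[2]=9, R[3]=9, R[4]=5.
print(moses_probability(9, 9, 5))
```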

References 

Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 6. 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel, 
1972. 


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University
Ramat-Gan, Israel 
NDTRI
Syntax
R←NDTRI P
Description 
NDTRI computes the percentile and ordinate of the normal distribution. 
Argument 
P - A probability.
Result 
R - A vector of 2 elements: 
R[1] - The point on the normal curve corresponding to P; that is, the probability that a standard 
normal deviate will be less than or equal to R[1] is P. 
R[2] - The ordinate (the height of the bell-shaped curve) at R[1].
Example 


NDTRI .95 
1.64521144 0.1030749562 


Reference 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel, 
1972. 


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University 
Ramat-Gan, Israel 
PCONFINT
Syntax
R←PCONFINT X
Description 
PCONFINT computes confidence limits for the P (probability of success) of a binomial distribution. 
Argument 
X - A 3-element vector: 


X[1] - Number of successes 
X[2] - Number of trials 


X[3] - Alpha 
Result 
R - A 2-element vector:


R[1] - The lower limit
R[2] - The upper limit


Example
R←PCONFINT 9 15 .05
ise achat 0.847972522 
Notes and Hints 
1) The program uses the normal approximation to the binomial distribution.
2) For the inverse of this function, see BINOMTEST in this workspace. 
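
A minimal Python sketch of limits based on the normal approximation follows. The simple symmetric form (sample proportion plus or minus a normal deviate times the standard error) is an assumption, since the exact form used by PCONFINT is not stated here; the name binomial_confidence_limits is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def binomial_confidence_limits(successes, trials, alpha):
    """Return (lower, upper) limits for P using the normal approximation."""
    p_hat = successes / trials
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided normal deviate
    half_width = z * sqrt(p_hat * (1 - p_hat) / trials)
    return p_hat - half_width, p_hat + half_width


lower, upper = binomial_confidence_limits(9, 15, 0.05)
print(lower, upper)
```

The upper limit obtained this way agrees with the value printed in the example to about three decimal places; the remaining difference would be consistent with the library using an approximate normal deviate.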


Reference 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel,
1972.


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University 
Ramat-Gan, Israel 
PTBISERIAL
Syntax 
PTBISERIAL 
Description 


PTBISERIAL computes the point-biserial correlation coefficient and the t-value for a test of
significance.


Input
The user is asked to supply the Nx2 input data matrix, with the two variables in the two columns.
Output 


1) The point biserial correlation. 
2) The t-value. 


Example 


The first column of the data represents scores on an IQ test, while the second column has a 1 for 
a pass and 0 for a fail on another test. 


PTBISERIAL 
ENTER THE Nx2 MATRIX OF DATA VALUES 


DATA 
POINT-BISERIAL CORRELATION COEFF....0.6258740094
T-VALUE FOR SIGNIFICANCE LEVEL...... 2.779874283 
Notes and Hints 


This measure of association is used when it is desired to determine the relationship between two 
variables, one of which is continuous and the other of which is dichotomous. 
References
McNemar, Q. Psychological Statistics, John Wiley & Sons Inc., 1962.


Pearson, K. “On a New Method of Determining Correlation When One Variable is Given by
Alternative and the Other by Multiple Categories”, Biometrika, Volume 7, 1910, p. 248.


Treloar, A.E. Correlation Analysis, Burgess Publishing Co., Minneapolis, 1942. 
Source 

Dr. Mitchell Snyder 

Computer Centre 


Bar-Ilan University
Ramat-Gan, Israel 


QTEST 
Syntax 
R←QTEST X
Description
QTEST performs Cochran's Q-test.
Argument
X - An NxK boolean matrix, where N is the number of subjects and K is the number of conditions.
Result 
R - A vector of 2 elements: 


R[1] - The degrees of freedom. 
R[2] - The chi-square value for Cochran's Q-test. 


Example 
Taken from Siegel, page 164. See References. 


X


ввеврервеовновевоово 
|!HopBpPHBHBHOOPRPPOOnBO 
осовоооввпооооофоооооо 


ReQTEST X 
R 
2 16.66666667 
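
The Q statistic itself is short to compute. The Python sketch below is an illustration only: the name cochran_q is hypothetical, and the 18x3 matrix is an assumed data set (column totals 13, 13 and 3), not a transcription of the listing above.

```python
def cochran_q(matrix):
    """Return (degrees of freedom, Q) for an N x K matrix of 0s and 1s."""
    k = len(matrix[0])
    column_totals = [sum(row[j] for row in matrix) for j in range(k)]
    row_totals = [sum(row) for row in matrix]
    grand_total = sum(row_totals)
    numerator = (k - 1) * (k * sum(g * g for g in column_totals) - grand_total ** 2)
    denominator = k * grand_total - sum(t * t for t in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when every row is all 0s or all 1s")
    return k - 1, numerator / denominator


# Assumed data: 18 subjects (rows) under 3 conditions (columns).
data = [
    (0, 0, 0), (1, 1, 0), (0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0),
    (1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 0), (1, 1, 1), (1, 1, 1),
    (1, 1, 0), (1, 1, 0), (1, 1, 0), (1, 1, 1), (1, 1, 0), (1, 1, 0),
]
df, q = cochran_q(data)
print(df, q)
```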
References
Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 4.


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel,
1972.


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University 
Ramat-Gan, Israel 
RANDOMTEST
Syntax
Y←RANDOMTEST X
Description 
RANDOMTEST performs the test for runs above and below the median. 


Argument 
X - A vector of observations. 


Result 


Y - A vector of 6 elements containing (in Siegel’s notation):
R - The number of runs above and below the median.
N1 - The number of times the first case (i.e., X[I]<MEDIAN X) appears.
N2 - The length of X, minus N1.
The expected value of R,
The standard deviation of R, and
The Z-value corresponding to R.


Output
The elements of Y are printed, together with the appropriate labels.
Example


R←RANDOMTEST 50?200
NO OF RUNS=26
N1=25
N2=25
MEAN=26
ST.DEV.=3.499271061
Z=0


R 
26 25 25 26 3.499271061 0 


Notes and Hints 
The program performs a runs test on the number of times the elements of X are above or below the
median. This is just one of the tests for randomness in this workspace, the others being
COXSTUART and WALD.
References
Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 4. 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel, 
1972. 


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University
Ramat-Gan, Israel 
RANK
Syntax
R←RANK X
Description 
RANK generates the ranks of observations in a vector or matrix in ascending order. 
Argument 
X - A vector or matrix of observations. 
Result 
R - An array of the same shape as X in which each observation has been replaced by its rank. 


Example 


Y←RANK X

Y
4 1 2 3
1 4 3 2
2 4 3 1
1 4 3 2
2 4 1 3


Notes and Hints 
Where X is a matrix, the ranking is done within the rows of the matrix.
Reference 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel, 
1972. 


Source 

Dr. Mitchell Snyder 
Computer Centre 
Bar-Ilan University 


Ramat-Gan, Israel 


(Modified by L. Gibson, I.P. Sharp Associates Limited) 
RHO
Syntax
R←X RHO Y
Description


RHO computes the Spearman rank-order correlation coefficient corrected for ties. It also evaluates the
probability of the observed RHO value by means of a one-tailed t-test.


Arguments


X - A vector of scores. 
Y - A vector of scores equal in length to X. 


Result 


R - A 2-column matrix giving the rank order of the left argument on the left and the rank order
of the right argument on the right.


Output 

In addition to the result, RHO provides the values labelled: 

RHO: 

T: 

PROBABILITY: 

However, if RHO is 1 or -1, the t statistic and probability are not given.


Example 


Taken from Siegel, page 205. See Reference. 


X 

82 98 87 40 116 113 111 83 85 126 106 117 
Y

42 46 39 37 65 88 86 56 62 92 54 81
R←X RHO Y

RHO: 0.8181818182

T: 4.5


PROBABILITY: 0.0005661690045
R
 2  3
 6  4
 5  2
 1  1
10  8
 9 11
 8 10
 3  6
 4  7
12 12
 7  5
11  9


Notes and Hints 


1) If either argument has fewer than 10 values, the significance of RHO is better determined from
tables.
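
For illustration, the coefficient and its t statistic can be reproduced in Python as below. This is a sketch, not the library's code: the names midranks and spearman_rho are hypothetical, and the correction for ties is obtained by applying the product-moment formula to mid-ranks, which is assumed to be equivalent to the corrected formula in the Siegel reference.

```python
from math import sqrt


def midranks(values):
    """Ascending ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Return (rho, t, rank pairs); t is None when rho is 1 or -1."""
    rx, ry = midranks(x), midranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    rho = sxy / sqrt(sxx * syy)
    if 1 - rho * rho < 1e-12:
        return rho, None, list(zip(rx, ry))
    t = rho * sqrt((n - 2) / (1 - rho * rho))
    return rho, t, list(zip(rx, ry))


x = [82, 98, 87, 40, 116, 113, 111, 83, 85, 126, 106, 117]
y = [42, 46, 39, 37, 65, 88, 86, 56, 62, 92, 54, 81]
rho, t, rank_pairs = spearman_rho(x, y)
print(rho, t)
```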


Reference 

Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 4. 
Source 

Dr. G.H. McLaughlin

Newhouse Communications Center 


Syracuse University 
Syracuse, N.Y. 
RUNS
Syntax
R←RUNS X
Description 


RUNS computes the number of runs in a dichotomized string. 


Argument 
X - The vector to be analyzed. 
Result 
R - The number of runs in X.
Example
X
0 1 1 1 0 0 1 1 0 1
Y←RUNS X
Y 
6 


Notes and Hints 

Only two different values should appear in the set X. 

References 

Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 4.


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel, 
1972. 


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University 
Ramat-Gan, Israel 
RUNSTEST
Syntax
Y←RUNSTEST X
Description 


RUNSTEST performs the runs test for randomness. 


Argument 

X - A vector (character or numeric) containing only 2 different values. 

Result 

Y - A vector of length 6 containing (in Siegel’s notation): R, the number of runs; N1, the number
of times X[1] appears; N2, the length of X, minus N1; the expected value of R; the standard
deviation of R; and the Z-value corresponding to R.


Output

The elements of Y are printed, together with the appropriate labels.
Example 

Taken from Siegel, page 57. See References. 


X
MFMFMMMFFMFMFMFMMMMFMFMFMMFFFMFMFMFMMFMMFMMMMFMFMM


Y←RUNSTEST X
NO OF RUNS=35
N1=30
N2=20
MEAN=25
ST.DEV.=3.356382893
Z=2.979397858


Y
35 30 20 25 3.356382893 2.979397858
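
The six elements of the result follow from the large-sample formulas in the Siegel reference. The Python sketch below is an illustration only (the names count_runs and runs_test are hypothetical); it is applied to the sequence of the example.

```python
from math import sqrt


def count_runs(sequence):
    """Number of runs: one more than the number of changes between neighbours."""
    if len(sequence) == 0:
        return 0
    return 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)


def runs_test(sequence):
    """Return (R, N1, N2, mean of R, standard deviation of R, Z)."""
    r = count_runs(sequence)
    n1 = sum(1 for v in sequence if v == sequence[0])
    n2 = len(sequence) - n1
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    sd = sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
              / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return r, n1, n2, mean, sd, (r - mean) / sd


x = "MFMFMMMFFMFMFMFMMMMFMFMFMMFFFMFMFMFMMFMMFMMMMFMFMM"
result = runs_test(x)
print(result)
```

The same mean and standard deviation formulas apply to the counts reported by RANDOMTEST and WALD; only the way the two-valued sequence is formed differs.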
References
Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 4. 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel, 
1972. 


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University 
Ramat-Gan, Israel 
SIGNTEST
Syntax
R←X SIGNTEST Y
Description 
SIGNTEST performs the sign test on two related samples. 
Arguments 


X - A vector of numbers. 
Y - A vector of equal length to X. 


Result 

R - A 5-element vector: 

R[1] - The number of non-zero differences 

R[2] - The smaller number of like-signed differences 
R[3] - P=0.5 


R[4] and R[5] - Contain the 2 tails for the binomial distribution, if R[1]<25.
R[4] - The Z statistic if R[1]>25.


Output 
The values of R are printed, with appropriate captions. 
Example 


Taken from Siegel, page 70. See References. 


4.4175 35.73. 2 26 Bebe BeBe. И ИЕ ЖЩ, 


2 3 0.53 33 32 3 2 2 5 2.5 9 1 
R←X SIGNTEST Y

N=14

X=3

ONE-TAILED P-VALUE=0.02868652344


R 
14 3 0.5 0.02868652344 0.02868652344 
Notes and Hints

The program does not correct for ties. 
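
The binomial tail for the small-sample case can be sketched in Python as follows. The name sign_test and the paired data are hypothetical; the data were constructed only to give 14 non-zero differences, 3 of them with the less frequent sign.

```python
from math import comb


def sign_test(x, y):
    """Return (non-zero differences, smaller sign count, one-tailed probability)."""
    differences = [a - b for a, b in zip(x, y) if a != b]
    n = len(differences)
    positives = sum(1 for d in differences if d > 0)
    k = min(positives, n - positives)
    # P(K <= k) for a binomial distribution with n trials and P = 0.5.
    probability = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n, k, probability


# Hypothetical pairs: 11 positive, 3 negative and 3 zero differences.
x = [5] * 11 + [1] * 3 + [3] * 3
y = [3] * 17
n, k, probability = sign_test(x, y)
print(n, k, probability)
```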

References 

Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 5. 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel,
1972.


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University 
Ramat-Gan, Israel 
UTEST
Syntax
R←X UTEST Y
Description 


UTEST computes the Mann-Whitney U statistic. 


Arguments 

X - A vector of observations. 

Y - A vector, not necessarily of the same length as X. 

Result 

R - A vector of length 2 containing the Mann-Whitney U statistic and the corresponding Z
statistic.

Example 


Taken from Siegel, page 122. See References. 


X 
13 12 12 10 10 10 10 9 8 8 7 7 7 7 7 6
Y
17 16 15 15 15 14 14 14 13 13 13 12 12 12 12
11 11 10 10 10 8 8 6
R←X UTEST Y
R
304 3.426241444


Notes and Hints 

For a similar function, which corrects for ties, see MANN in this workspace. 

References 

Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 6. 


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel, 
1972. 


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University 
Ramat-Gan, Israel 
WALD
Syntax
R←X WALD Y
Description 
WALD performs the Wald-Wolfowitz runs test. 
Arguments 


X - A vector of observations. 
Y - A vector of observations. 


Result 


R - A vector of length 6, containing the number of runs, N1, N2 (Siegel’s notation), the mean,
standard deviation and the Z-value.


Output 


The program also prints an annotated table of R. 


Example 
X

16 5 6 10 12 18 13 11 14 19 17 8 4 9 15
Y

7 12 8 15 1 17 13 20 18 16 8 6 2 4 10 9
R←X WALD Y

NO OF RUNS=25

N1=16

N2=15


MEAN=16.48387097
ST.DEV.=2.734144529
Z=3.114732576


R 
25 16 15 16.48387097 2.734144529 3.114732576 


Notes and Hints 


There are no corrections for ties or continuity. 
References
Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, 1956, Chapter 6.


Snyder, M. NONPAR: An APL Nonparametric Statistical Package, Bar-Ilan University, Israel,
1972.


Source 
Dr. Mitchell Snyder 
Computer Centre 


Bar-Ilan University 
Ramat-Gan, Israel 
31 PARAMETRIC


FUNCTIONS

Function
Header             Documentation    Description

R←N BART V         BARTHOW          Tests homogeneity of variances (non-conversational).

BARTLETT           BARTLETTHOW      Tests homogeneity of variances (conversational).

DUNCAN             DUNCANHOW        Performs Duncan’s test on a set of means.

R←A SCHEFFE X      SCHEFFEHOW       Performs linear comparisons among population means
                                    using Scheffe's method for multiple comparisons.
BART
Syntax
R←N BART V
Description 
BART tests homogeneity of variances for normally distributed populations. 


Arguments 


V - A vector of variances.
N - The vector of corresponding sample sizes (enter only 1 number if they are all equal). 


Result 


R - A 3-element vector result containing the pooled variance, chi-square value and degrees of
freedom.


Example

Taken from Steel and Torrie, page 347. See References.
R←81 44 13 BART 58.57 76.84 79.67

66.26488889 1.276794401 2

Notes and Hints 

See BARTLETT for a more conversational version of this test. 


References 


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley &
Sons Inc., New York, 1961, Chapter 9.4.


Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, London, 1960.
Source 

L. Gibson, S. Maxwell, S. Swaminathan 

Institute of Computer Science 


University of Guelph 
Guelph, Ontario 
BARTLETT
Syntax 
BARTLETT 
Description 
This function tests whether the variances of K normally distributed populations are equal. 
Input 
BARTLETT conversationally requests a set of variances and their respective sample sizes. 
Output 
BARTLETT returns the pooled variance, chi-square value and degrees of freedom. 
Example 
Taken from Steel and Torrie, page 347. See References. 


BARTLETT 
ENTER THE VARIANCES 


□:
58.57 76.84 79.67
ENTER THE RESPECTIVE SAMPLE SIZES FOR EACH VARIANCE (1 ENTRY IF ALL ARE EQUAL)
□:
81 44 13 
THE POOLED VARIANCE = 66.26488889 
DEGREES OF FREEDOM = 2; BARTLETT'S CHI-SQUARE = 1.276794401 
Notes and Hints 
See BART for a non-conversational, non-formatted version. 


References 


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., New York, 1961, Chapter 9.4. 


Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, London, 1960. 
Source 

L. Gibson, S. Maxwell, S. Swaminathan 

Institute of Computer Science 


University of Guelph 
Guelph, Ontario 
DUNCAN
Syntax 
DUNCAN 
Description 


This function will test the differences of a set of N treatment means with equal or unequal sample 
sizes using Duncan’s Multiple Range Test. 


Input 

DUNCAN conversationally requests: 

1) The means. 

2) The sample sizes. (Note: If all sample sizes are equal enter the value only once.) 

3) The mean square error (not the standard error of the mean). 

Output 

The output takes the following form: 

1) A table of means and their associated ranks, which are used later to identify the means. 


2) A list of means (identified by ranks). A line is typed under any sequence of means in which 
there is found to be no significant difference at the 5 percent level. 


Example 


Taken from Steel and Torrie, page 108. See References. 
(See next page.) 


Notes and Hints 
1) The significant studentized ranges are calculated by the program. 
2) The means can be entered by typing a variable name when asked. If entering directly, and more
than one line is required, type ,□ at the end of the first line, and a second input line may then
be typed.
Example


DUNCAN
ENTER MEANS
□:

28.8 24 14.6 19.9 13.3 18.7
ENTER THE RESPECTIVE SAMPLE SIZES/MEAN (1 ENTRY IF ALL ARE EQUAL)
□:


5
ENTER THE DEGREES OF FREEDOM OF THE ERROR
□:

24
ENTER THE MEAN SQUARE ERROR
□:

11.79
IDENTIFICATION MEAN 

6 28.800 

5 24.000 

2 14.600 

4 19.900 

1 13.300 

3 18.700 

1 2 6 

Assumptions 


Remember Duncan's test should not be used to compare means arising from various levels of one 
treatment in an experimental design. 


References 
Duncan, D.B. “Multiple Range and Multiple F Tests”, Biometrics, 11, 1955, pp. 1-42.


Kramer, C.Y. “Extension of Multiple Range Tests to Group Means with Unequal Numbers of
Replications”, Biometrics, 12, 1956, pp. 307-310.


Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, London, 1960.
Source 

L. Gibson, S. Maxwell, S. Swaminathan 

Institute of Computer Science 


University of Guelph 
Guelph, Ontario 
SCHEFFE
Syntax
R←A SCHEFFE X
Description 
SCHEFFE can be used to make all possible linear comparisons among t population means. 
Arguments 


X - an MxT matrix, where the I-th column is a sample from the I-th population. If the I-th
sample size is less than M, just fill the extra elements with zeros.


A - either a t-element vector or a matrix with t columns. These are the linear coefficients specifying
the means comparisons that are to be tested. The row sums of A must be identically 0.


Result


R - a 3-column matrix with as many rows as there are rows in the argument A (one row if A
is a vector).


Column 1: The inner product of the rows of A and the sample means
Column 2: The corresponding F-statistics 


Column 3: The corresponding probability levels (the probabilities of obtaining F-values greater than 
the F-statistics) 


Example 


The timing algorithm used in a certain timesharing system is such that CPU units consumed for
executing identical expressions need not always be identical. It is required to compare the efficiency
of 2 different programs, where program A consists of 2 applications of subprogram F1 and 3
applications of subprogram F2, and program B is one application of F3 and 2 applications each of F4 and
F5. CPU units required by each subprogram have been obtained using 15 trials for each subprogram,
and are listed below:
M
509 394 300 685 329 
507 386 311 673 283 
466 378 316 720 334 
483 396 298 723 298 
486 360 331 718 284 
479 397 292 705 298 
485 401 317 687 309 
472 394 274 721 318
479 372 313 743 332 
490 392 285 723 322 
487 390 286 664 299 
461 372 311 672 287 
498 462 314 664 292 
488 381 298 695 301 
453 385 290 694 291 


Let TI be the time required by subprogram FI and

L = (2×T1)+(3×T2)-((1×T3)+(2×T4)+(2×T5))

H0: L=0 (The number of CPU units required by program A is the same as that required by program
B.)

H1: L≠0 (One of the 2 programs is significantly more efficient than the other one.)


2 3 ¯1 ¯2 ¯2 SCHEFFE M
¯173.2 13.27443167 4.230856381E¯8


Since the probability of obtaining an absolute value of 173.2 (or greater) for the test statistic is only
4.23E¯8, we reject H0. That is, program A and program B are significantly different in their CPU
requirements.
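
The first two columns of the result can be checked with the usual formula for the F-statistic of a contrast under Scheffe's method. The Python sketch below is an illustration, not the library's code: the name scheffe_contrast is hypothetical, equal and complete samples are assumed (no zero padding), and the probability level is omitted because it needs the F distribution.

```python
def scheffe_contrast(coefficients, samples):
    """Return (estimate of the contrast, F-statistic) for one set of coefficients."""
    t = len(samples)
    means = [sum(s) / len(s) for s in samples]
    error_ss = sum(sum((v - m) ** 2 for v in s) for s, m in zip(samples, means))
    error_df = sum(len(s) for s in samples) - t
    mean_square_error = error_ss / error_df
    estimate = sum(c * m for c, m in zip(coefficients, means))
    scale = sum(c * c / len(s) for c, s in zip(coefficients, samples))
    f_statistic = estimate ** 2 / ((t - 1) * mean_square_error * scale)
    return estimate, f_statistic


rows = [
    (509, 394, 300, 685, 329), (507, 386, 311, 673, 283), (466, 378, 316, 720, 334),
    (483, 396, 298, 723, 298), (486, 360, 331, 718, 284), (479, 397, 292, 705, 298),
    (485, 401, 317, 687, 309), (472, 394, 274, 721, 318), (479, 372, 313, 743, 332),
    (490, 392, 285, 723, 322), (487, 390, 286, 664, 299), (461, 372, 311, 672, 287),
    (498, 462, 314, 664, 292), (488, 381, 298, 695, 301), (453, 385, 290, 694, 291),
]
columns = [list(column) for column in zip(*rows)]  # one sample per subprogram
estimate, f_statistic = scheffe_contrast([2, 3, -1, -2, -2], columns)
print(estimate, f_statistic)
```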

Notes and Hints 


See the function ANOVA? in the workspace 34 ANOVA for performing analyses of variance on X. 


References 


1) Ott, L. An Introduction to Statistical Methods and Data Analysis, Duxbury Press, North
Scituate, Mass., 1977.


2) Scheffe, H. The Analysis of Variance, John Wiley and Sons Inc., New York, 1959.
Source 
R. Hui 


I.P. Sharp Associates Limited
Edmonton, Alberta 
31 TTEST


FUNCTIONS

Function
Header             Documentation    Description

P TPOT Q           TPOTHOW          Computes t statistic for single population and tests
                                    MU=X when standard deviation is unknown.

X TTEST Y          TTESTHOW         Performs 1- and 2-sample t-tests.


TPOT

Syntax 
(T, SD) TPOT (XBAR, N) 
Description 


TPOT performs a t-test on a single population, testing the null hypothesis that the sample mean is 
equal to zero, given that the standard deviation is unknown. 


Arguments 

T - The calculated t-value. 

SD - Standard deviation of the sample. 
XBAR - Mean of the sample. 

N - Sample size. 

Input 


TPOT requests a value to be entered for the level of significance of the critical region. 
Output 

1) Computed and theoretical t. 

2) Whether the null hypothesis is accepted or rejected. 

3) The confidence interval, where appropriate.


Example 


(2.75,.2) TPOT (.4,30) 
ENTER DESIRED CONFIDENCE LEVEL (E.G. .95) 


□:

.95
TYPE IN 1 OR 2 FOR THE NO OF TAILS
□:


1
THE NULL HYPOTHESIS IS REJECTED AT THE 0.95 LEVEL
THE ABSOLUTE VALUE OF THE COMPUTED T IS 2.75 
THE THEORETICAL VALUE OF T AT GIVEN LEVEL IS 1.697647233 
Notes and Hints
TPOT allows for a 1- or 2-sided alternative about the mean. 
Reference 


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957, Chapter 
9. 


Source 
L. Gibson, S. Maxwell, S. Swaminathan 
Institute of Computer Science 


University of Guelph 
Guelph, Ontario 
TTEST
Syntax 
X TTEST Y 
Description 


TTEST will perform a 1-sample or a 2-sample t-test. 


Arguments 
X - A vector of observations. 
Y - For the 1-sample t-test, it is the value to be tested against the mean of X; otherwise it is another
set of observations, not necessarily the same length as X.
Output 


1) Sample sizes and means. 

2) Variances (pooled if appropriate). 

3) Standard deviations (pooled if appropriate). 

4) Standard error (pooled if appropriate). 

5) Computed and theoretical t. 

6) Whether null hypothesis is accepted or rejected. 
7) Confidence interval where appropriate. 


Example: 1-Sample 


X 
9. 3«-5- "5-58 &- 22:54 лот 

X TTEST 5 
SAMPLE SIZE 10 
DEGREES OF FREEDOM 9 
MEAN 5.5 
STANDARD DEVIATION 3.027650354 
STANDARD ERROR OF MEAN 0.9574271078 
ENTER DESIRED CONFIDENCE LEVEL (E.G. .95) 
□:

.95
TYPE IN 1 OR 2 FOR THE NO OF TAILS
□:


2
THERE IS INSUFFICIENT EVIDENCE TO REJECT THE NULL HYPOTHESIS
THE ABSOLUTE VALUE OF THE COMPUTED T IS 0.5222329679
THE THEORETICAL VALUE OF T AT GIVEN LEVEL IS 2.228725137
TO EXPRESS THE ABOVE IN A CONFIDENCE INTERVAL


P( 3.366158138 < TRUE MEAN < 7.633841862)=0.95
Example: 2-Sample


X←85 87 92 80 84
Y←89 89 90 84 88


X TTEST Y
IS YOUR DATA PAIRED?(Y OR N)


N
SAMPLE SIZES 


5 5 

MEANS 85.6 88 
POOLED VARIANCE 12.4 
POOLED STANDARD DEVIATION 3.521363372 
STANDARD ERROR OF MEAN DIFFERENCE 2.227105745 
ENTER DESIRED CONFIDENCE LEVEL (E.G. .95)
□:

.95
TYPE IN 1 OR 2 FOR THE NO OF TAILS
□:


1 
THERE IS INSUFFICIENT EVIDENCE TO REJECT THE NULL HYPOTHESIS 
THE ABSOLUTE VALUE OF THE COMPUTED T IS 1.077631812 
THE THEORETICAL VALUE OF T AT GIVEN LEVEL IS 1.860021889 


Notes and Hints 

1) You can choose a one-sided or two-sided alternative and specify the desired significance level. 

2) For the 2-sample test, a choice among several t-test calculations will be made depending upon 
whether the data is paired, and whether the sample sizes and/or variances are equal. Bartlett’s 
Test, at the 5 percent level, is used to test whether variances are equal. 
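
For unpaired data with variances treated as equal, the printed quantities in the 2-sample example follow from the pooled-variance formulas. The Python sketch below covers only that case and is not the library's code; the name pooled_t is hypothetical, and the theoretical t value is omitted because it needs the t distribution.

```python
from math import sqrt


def pooled_t(x, y):
    """Return (pooled variance, standard error of the mean difference, |t|)."""
    n1, n2 = len(x), len(y)
    mean_x, mean_y = sum(x) / n1, sum(y) / n2
    ss_x = sum((v - mean_x) ** 2 for v in x)
    ss_y = sum((v - mean_y) ** 2 for v in y)
    pooled_variance = (ss_x + ss_y) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return pooled_variance, standard_error, abs(mean_x - mean_y) / standard_error


x = [85, 87, 92, 80, 84]
y = [89, 89, 90, 84, 88]
pooled_variance, standard_error, t = pooled_t(x, y)
print(pooled_variance, standard_error, t)
```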


Reference 


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957, Chapter 
9. 


Source 

L. Gibson, S. Maxwell, S. Swaminathan
Institute of Computer Science 

University of Guelph 

Guelph, Ontario 


(Modified by F. Arthur, I.P. Sharp Associates Limited)
Workspaces:


LIBRARY 32 
MODEL PARAMETER ESTIMATION 


CORRELATION 
FILEREG 
LAGS 
NONLINEAR 
PROBIT 
REGRESSION 


STEPWISE 
32 CORRELATION


FUNCTIONS

Function
Header             Documentation    Description

M←CM DATA          CMHOW            Produces a correlation matrix.
CORMAT DATA        CORMATHOW        Produces a labelled correlation matrix.
PARTCORR           PARTCORRHOW      Prints a partial correlation coefficient.
R←SCORR D          SCORRHOW         Calculates simple correlation coefficients.
CM
Syntax

M←CM DATA

Description 

CM will produce a correlation matrix (M) for the input matrix. 

Argument 

DATA - A matrix with rows as observations and columns as variables. 

Result 

M - The correlation matrix, a square matrix with as many rows and columns as DATA has variables
(or columns). The correlation coefficient for (say) variate 1 and variate 3 would be found
in row 1 and column 3 of M (and is of course equal to the value in row 3, column 1).


Example 


72 74 76 100 


M←CM DATA

M
1 0.302629848 0.412200046 0.701689501
0.302629848 1 0.946460429 0.521123629
0.412200046 0.946460429 1 0.511911641
0.701689501 0.521123629 0.511911641 1 


Notes and Hints 


1) A labelled non-storable result can be created by using CORMAT rather than CM. (See 
CORMATHOW in this workspace.) 


2) For a function that handles missing values, see SCORR in this workspace. 
References
Freund, J.E. Modern Elementary Statistics, Second Edition, Prentice-Hall, 1964, Chapter 5. 


Smillie, K.W. Statpack 2: An APL Statistical Package, Second Edition, Publication No. 17, Universi- 
ty of Alberta, February 1969. 


Source 
Dr. K.W. Smillie 
Department of Computing Science 


University of Alberta 
Edmonton, Alberta 
CORMAT
Syntax
CORMAT DATA
Description
CORMAT generates a labelled matrix of simple correlation coefficients for the input matrix DATA.
Arguments 
DATA - A matrix with the rows being observations and the columns being variates. 
Output 
A labelled correlation matrix is produced, showing correlation coefficients for all pairs of variables. 
Example 
3 37 32 9 
62 14 20 54 
86 46 47 49


а5 23 5 60 
72 74 76 100


CORMAT DATA 
1 2 3 4 
1| 1.0000 0.3026 0.4122 0.7017 
2| 0.3026 1.0000 0.9465 0.5211 
3| 0.4122 0.9465 1.0000 0.5119 
4| 0.7017 0.5211 0.5119 1.0000


Notes and Hints 


CORMAT uses the program CM which produces an unlabelled, storable, correlation matrix. If you wish
to store the correlation matrix, or process a larger matrix, use CM. (See CMHOW in this workspace.)
Reference

Freund, J.E. Modern Elementary Statistics, Second Edition, Prentice-Hall, 1964, Chapter 5. 
Source 

Based on CM function found in: 

Smillie, K.W. Statpack 2: An APL Statistical Package, Second Edition, Publication No. 17, Universi- 
ty of Alberta, February 1969. 

L. Gibson, S. Maxwell, S. Swaminathan 

Institute of Computer Science


University of Guelph 
Guelph, Ontario 
PARTCORR
Syntax
PARTCORR
Description
PARTCORR prints a partial correlation coefficient.
Input 


Inputs requested include a correlation matrix and the numbers of the two variables whose partial 
correlation is desired, as well as the numbers of the variables whose effects are to be eliminated. 


Output 
The resultant partial correlation coefficient is printed. 


Example 


52 6 33 49 
41 28 5 44
69 40 29 18 
10 58 60 22 
11 27 65 2 
67 74 47 61 
63 71 16 62 

9 80 72 14 
50 31 53 54 
30 79 24 4
70 57 34 46 
43 39 23 45 
15 36 3 68 
PARTCORR
ENTER THE NAME OF THE MATRIX OF CORRELATION COEFFICIENTS
□:

CM DATA
ENTER THE NUMBERS OF THE VARIABLES WHOSE PARTIAL
CORRELATION IS DESIRED


□:
1 3

ENTER THE NUMBERS OF THE VARIABLES WHOSE EFFECTS ARE TO BE
REMOVED

□:
2 4


PARTIAL CORRELATION OF VARIABLES 1 3 WITH THE EFFECTS
OF VARIABLES 2 4 REMOVED IS ¯0.372463158


Notes and Hints 


The matrix for input to PARTCORR is obtained by using the function CM. If this correlation matrix 
is not already known, type 


CM DATAMATRIX 


where DATAMATRIX is the name of the raw data matrix, when prompted for the correlation matrix. 
(Refer to documentation for CM.) 
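
For readers who wish to check a partial correlation by hand, the following Python sketch computes one from a correlation matrix by inverting the sub-matrix of the variables involved. It is an illustration only and is not necessarily the method used inside PARTCORR. The function name partial_correlation, the use of NumPy, the 0-origin indices and the sample matrix are all assumptions of the sketch.

```python
import numpy as np

def partial_correlation(corr, pair, removed):
    """Partial correlation of the two variables in `pair`,
    with the effects of the variables in `removed` eliminated.

    corr    : square correlation matrix (as produced by a function such as CM)
    pair    : two 0-origin variable indices
    removed : 0-origin indices of the variables whose effects are removed
    """
    idx = list(pair) + list(removed)
    sub = np.asarray(corr, dtype=float)[np.ix_(idx, idx)]
    prec = np.linalg.inv(sub)          # inverse of the selected sub-matrix
    # Standard identity: partial r = -p12 / sqrt(p11 * p22)
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])

# Hypothetical 3-variable correlation matrix, for illustration only.
sample = [[1.0, 0.5, 0.3],
          [0.5, 1.0, 0.4],
          [0.3, 0.4, 1.0]]
print(partial_correlation(sample, (0, 1), [2]))
```

With an empty list of removed variables the function simply returns the ordinary correlation of the pair, which is a convenient sanity check.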


Reference 


Merrill, W.C. and K.A. Fox. Introduction to Economic Statistics, John Wiley & Sons Inc., New
York, 1970, Chapter 10.

SCORR
Syntax
R←SCORR D
Description 
The simple correlation coefficient is computed between each distinct pair of variates for all observations 
except those in which either or both observations are missing for the particular pair of variates in 
question. 


Argument 


D - A matrix of observations with the rows corresponding to observations and the columns to
variates. Missing observations are recorded in D as any negative number.


Result 

R - A matrix with 4 columns in the following format: 
Col 1: I=Column index of first variate 

Col 2: J=Column index of second variate 

Col 3: Number of observations for variates I and J 


Col 4: Correlation coefficient for variates I and J 


If D has K columns, then R has (K×(K-1))÷2 rows.


Example 
D
26 ¯1 16
24 20 14
¯1 36 31
18 13  1
¯1 37 27
¯1 28 10
29 ¯1 33
34 15 21
22 11  0
¯1 19 12
⎕←R←SCORR D
2 1 4 0.25992613
3 1 5 ¯0.213084998
3 2 7 ¯0.0261847113


Notes and Hints 


1) Pair-wise deletion of missing observations is used to arrive at the final set of observations used 
in the computations. 


2) Note that adding a constant so that all real data is non-negative will not affect the correlation 
coefficient. 
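
The pair-wise deletion described in note 1 can be sketched in Python as follows. This is an illustration only, not the library's code: the function name scorr, the use of NumPy and the tuple layout of the result are assumptions. The data are the matrix D of the example, with the missing-value entries read as -1.

```python
import numpy as np

def scorr(d):
    """Simple correlations with pair-wise deletion of missing observations.

    d : matrix of observations (rows) by variates (columns); any negative
        entry is treated as a missing observation.
    Returns one tuple (I, J, number of observations, correlation) for each
    distinct pair of variates, using 1-origin column indices.
    """
    d = np.asarray(d, dtype=float)
    rows = []
    for i in range(1, d.shape[1]):
        for j in range(i):
            # Keep only the observations present for BOTH variates of this pair.
            keep = (d[:, i] >= 0) & (d[:, j] >= 0)
            r = np.corrcoef(d[keep, i], d[keep, j])[0, 1]
            rows.append((i + 1, j + 1, int(keep.sum()), float(r)))
    return rows

D = [[26, -1, 16], [24, 20, 14], [-1, 36, 31], [18, 13, 1], [-1, 37, 27],
     [-1, 28, 10], [29, -1, 33], [34, 15, 21], [22, 11, 0], [-1, 19, 12]]
for row in scorr(D):
    print(row)
```

For variates 2 and 1 the sketch uses the 4 observations in which both values are present, and its first row agrees with the first row of the example result.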


Reference 


Smillie, K.W. Statpack 2: An APL Statistical Package, Publication No. 17, Department of Computing
Science, University of Alberta, February 1969.


Source 
Dr. K.W. Smillie 
Department of Computing Science 


University of Alberta
Edmonton, Alberta

32 FILEREG

FUNCTIONS

Function Header    Documentation    Description
REGRESS N          REGRESSHOW       Performs multiple regression on data stored in a file.

REGRESS
Syntax 
REGRESS N 
Description 


REGRESS performs simple or multiple regression analysis on data matrices containing any number of 
observations. REGRESS is particularly designed to handle those situations where there is too much data 
for REGR (in workspace 32 REGRESSION). The input data is stored on file. The regression can be 
performed on transformations of the original input by defining a subfunction TRANSFORM as described 
below. 


Argument 


N - The number of observations upon which analysis is required. If there are M observations in
the file, and an analysis of N<M is requested, the first N observations of the existing M are
processed. If N>M, an analysis is performed on the existing M observations and an appropriate
message follows the output.


Input 
The following input is requested: 


1) Tie number of the file containing the data. 

2) Column numbers in the data matrix of the dependent and independent variables. These numbers 
must refer to the column numbers in the transformed matrix if any transformations are performed. 
These may not correspond to those in the original input data matrix.

3) Whether you wish transformation of the data or not (read how to define the transformation 
subfunction below). 

4) Whether you wish the Durbin-Watson statistic. 


Definition of the Subfunction TRANSFORM 


In order to perform transformations on the original input data, you must define the subfunction 
TRANSFORM. 


This subfunction TRANSFORM must be niladic (that is, have no arguments), and have no explicit result.
The function must assume that the data component from the original file is called X and must ensure
that the resulting transformed matrix is also called X.


For example, suppose that the original data components contain 4 columns and you require a
regression of column 1 against column 2 and (column 3 times column 4). The necessary function would
then be:


∇TRANSFORM
[1] X[;3]←X[;3]×X[;4]
[2] X←0 ¯1↓X∇


Remember in this situation that the dependent variable is now in column 1 and the independent
variables are in columns 2 and 3.

Output


After a reminder to ALIGN PAGE, press the RETURN key to commence output. The following results
are printed:

1) The mean of the dependent variable.

2) The means, estimated coefficients, standard errors and t-values for all independent variables
(including the intercept).

3) An analysis of variance table including source of variation, degrees of freedom, sum of squares,
mean square and F-statistic.

4) Coefficient of multiple correlation R*2 (corrected and uncorrected), the F-statistic, the standard
error of the estimate, the coefficient of variation (measured in percentage terms at the mean of
Y), and the Durbin-Watson statistic (if requested).
Example 


'REGDATA' ⎕TIE 10

REGRESS 1000
TIE NUMBER OF DATA FILE PLEASE
⎕:
10
DEPENDENT VARIABLE PLEASE (COLUMN NUMBER)
1
INDEPENDENT VARIABLE(S) PLEASE (COLUMN NUMBER(S))
TRANSFORMATION REQUIRED (YES OR NO)?
NO
DURBIN-WATSON STATISTIC REQUIRED (YES OR NO)?
NO

ALIGN PAGE
MEAN OF DEPENDENT VARIABLE 59.120000

VARIABLE         MEAN         ESTIMATED COEFFICIENT    STD. ERROR    T-VALUE
CONSTANT TERM                 ¯1.31160                 2.66532       ¯0.49210
1                386.40000     0.24463                 0.00553       44.21968
2                 53.72400    ¯0.63457                 0.02591      ¯24.49188

SOURCE OF VARIATION    DF      SUM OF SQUARES     MEAN SQUARE      F-STATISTIC
MEAN                   1        3495174.40000
REGRESSION             2       11909373.62184    5954686.81092    1123.40846
RESIDUAL               997      5284654.97816       5300.55364
TOTAL                  1000    20688200.00000

F-STATISTIC FOR SIGNIFICANCE OF REGRESSION( 2, 997)   1123.4084619061
STANDARD ERROR OF THE ESTIMATE.....................     72.8049012023
MULTIPLE CORRELATION COEFFICIENT...................      0.6926860329
CORRECTED R*2 (RBAR*2).............................      0.6920294753
COEFFICIENT OF VARIATION (AT THE MEAN OF D)........    123.1476677982


Notes and Hints 


1) See documentation for the function REGR, which should always be used rather than REGRESS when
possible. (REGR resides in the workspace 32 REGRESSION.)


2) The calculation of the Durbin-Watson statistic requires a second pass through the data; therefore 
request its calculation judiciously. 


3) Each component in the data file must have the same number of columns (representing variables), 
but the number of rows (observations) may vary from component to component. 
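
Notes 2 and 3 can be illustrated by the following Python sketch, which processes the data one component at a time: a first pass accumulates the cross-product sums needed for the coefficients, and a second pass (made only on request) computes the Durbin-Watson statistic from the residuals. It is an illustration under stated assumptions and is not the code of REGRESS: the names design and regress_components, the use of NumPy, the 0-origin column numbers, and the representation of the file as a list of matrices are all hypothetical.

```python
import numpy as np

def design(block, dep, indep):
    """Split one component into a design matrix (with constant term) and y."""
    block = np.asarray(block, dtype=float)
    x = np.column_stack([np.ones(len(block)), block[:, indep]])
    return x, block[:, dep]

def regress_components(components, dep, indep, durbin_watson=False):
    """Least squares over a list of components with equal numbers of columns.

    components : list of matrices; row counts may differ between components
    dep, indep : 0-origin column number of the dependent variable and list of
                 column numbers of the independent variables
    """
    p = len(indep) + 1
    xtx = np.zeros((p, p))
    xty = np.zeros(p)
    for block in components:                 # first pass: accumulate sums
        x, y = design(block, dep, indep)
        xtx += x.T @ x
        xty += x.T @ y
    coef = np.linalg.solve(xtx, xty)

    dw = None
    if durbin_watson:                        # second pass: residuals in order
        num = den = 0.0
        prev = None                          # last residual of previous component
        for block in components:
            x, y = design(block, dep, indep)
            e = y - x @ coef
            if len(e) == 0:
                continue
            if prev is not None:
                num += (e[0] - prev) ** 2
            num += np.sum(np.diff(e) ** 2)
            den += np.sum(e ** 2)
            prev = e[-1]
        dw = num / den
    return coef, dw
```

The second loop shows why the Durbin-Watson statistic costs a second pass: the residuals cannot be formed until the coefficients from the first pass are known.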


Source
A. North
I.P. Sharp Associates Limited
Ottawa, Ontario

32 LAGS

FUNCTIONS

Function Header    Documentation    Description
PDLAG              PDLAGHOW         Performs a polynomial distributed lag analysis.
SHILLER            SHILLERHOW       Performs Shiller lag analysis.


PDLAG 
Syntax 
PDLAG 
Description 


The function PDLAG performs a polynomial distributed lag analysis for a user-defined econometric 
model. 


Input 
Data is requested in the following order: 


1) The dependent variable. 

2) Independent variables (unlagged):
A matrix of independent variables upon which no lag scheme is to be imposed, with each variable
in a column. A vector is allowed if there is only one variable. Enter 0 if there are no independent
variables which are not to be lagged.

3) Independent variables (lagged): 
These variables are to be entered one at a time using a statement of the form 


V LAG X 


where X is the independent variable on which is imposed the lag scheme defined in V. V is a vector 
with: 


V[1] - Degree of hypothesized polynomial
V[2] - Beginning of lag period
V[3] - End of lag period
V[4] - Near end zero restriction (0 or 1)
V[5] - Far end zero restriction (0 or 1)


Any number of lag schemes may be imposed upon any number of independent variables. 
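
The following Python sketch shows one common way of setting up a polynomial distributed lag of the kind described by V[1] to V[3]: each lag coefficient is taken to be a polynomial in the lag number, so that the lagged variable enters the regression through a few constructed columns. It is an illustration only and not the code of PDLAG. The names pdl_regressors and unscramble, the use of NumPy, and the omission of the end-point restrictions V[4] and V[5] are assumptions of the sketch. The reading "coefficient at lag i = A1 + A2×i + A3×i*2" agrees with the coefficients A1 to A3 and the lag coefficients printed in the example for this function.

```python
import numpy as np

def pdl_regressors(x, degree, begin, end):
    """Constructed regressors z[t, k] = sum over i of (i ** k) * x[t - i],
    for lags i = begin..end and powers k = 0..degree.
    Rows correspond to t = end, end + 1, ..., len(x) - 1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    lags = np.arange(begin, end + 1)
    # lagged[row, j] holds x[t - lags[j]] for t = end + row
    lagged = np.column_stack([x[end - i: n - i] for i in lags])
    powers = np.vander(lags, degree + 1, increasing=True).astype(float)
    return lagged @ powers

def unscramble(a, begin, end):
    """Recover the lag coefficients from the polynomial coefficients a."""
    lags = np.arange(begin, end + 1)
    return np.vander(lags, len(a), increasing=True) @ np.asarray(a, dtype=float)

# Polynomial coefficients A1, A2, A3 as printed in the example (degree 2, lags 0 to 3).
print(unscramble([-94.78160, 111.54992, -32.36098], 0, 3))
```

Regressing the dependent variable on these constructed columns (together with any unlagged variables) gives the polynomial coefficients; unscramble then converts them back into one coefficient per lag.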
Output 

The output includes the usual least squares analysis, showing: 

1) Estimated coefficients, corresponding standard errors and t-values 

2) R-squared and R-bar-squared 

3) F-statistic 

4) Standard error of the estimate 

5) Coefficient of variation 

6) Durbin-Watson statistic 


Also printed is a distributed lag interpretation for each lagged independent variable, including: 


1) The unscrambled lag coefficients
2) Corresponding standard errors and t-values

Example


60 1⍴EXP   60 1⍴VALUE   60 1⍴APPA   60 1⍴APPB
1316 50453 7.2 0
7557 51630 9.8 3236 
4587 31904 5.7 3727 
5328 98665 9.4 2607 
2190 49398 2.8 861 

471 26615 0.6 2719 
6789 9074 7.1 3636 
6793 94777 7.6 1001 
9347 7375 6.3 Зууц 
3836 50071 1.1 1886 
5195 38415 3.1 2024 
8310 27709 0 2402 

346 91382 9.5 3271 

535 52975 5.3 3024 
5298 46445 8.6 1849 
6712 94098 4.3 3806 

77 5009 9.3 2531 
3835 76152 3.5 1758 

669 77021 1.3 3299 
4175 82782 3.3 2756 
6868 12537 0.5 2808 
5890 1587 3.5 0 
9305 68856 6.6 3818 
8462 86825 0.2 3406 
5270 62955 4.5 1158 
920 73623 6.4 2150 
6540 72542 3.2 2058 
4160 99946 5.4 414 
7012 88858 Зе? 1657 
910+ 23320 3.8 2307 
7622 30633 0.3 3507 
2625 35102 0 1761 
475 51328 9.9 2919 
7361 59112 0 3478 
3283 84599 2.5 2863 
6327 41209 5.6 3203 
7565 84152 0 2827 
9911 26932 2.4 2967 
3654 41540 8.1 77 
2471 53731 4.6 3545 
9826 46792 8.9 2100 
7227 28722 3.6 1854 
7534 17833 6.8 261 
6516 15372 5.5 2854 
727 57166 0 1956 
6317 80241 5.9 2671 
8848 3306 9.7 272€ 
2728 53445 6.9 79% 
4365 49849 6.7 3667 
7665 95537 4.4 0 
4778 74830 7.4 3561
2378 55459


2750 89074 
3593 62485 
1666 84204 
4866 15977 
8977 21276 
9093 71471 

606 13043 
9047 9100 

PDLAG 


ENTER DEPENDENT VARIABLE 
⎕:
EXP
ENTER MATRIX OF INDEPENDENT VARIABLES WHICH ARE NOT TO BE
LAGGED (0 IF NONE)
⎕:
60 1⍴VALUE
ENTER HYPOTHESIZED LAG DISTRIBUTION AND INDEPENDENT
VARIABLE
⎕:
2 0 3 0 0 LAG APPA
ANOTHER INDEPENDENT VARIABLE TO BE LAGGED?
YES
ENTER HYPOTHESIZED LAG DISTRIBUTION AND INDEPENDENT
VARIABLE
⎕:
3 2 6 0 0 LAG APPB
ANOTHER INDEPENDENT VARIABLE TO BE LAGGED?
NO
ALIGN PAPER 


ORDINARY LEAST SQUARES

            ESTIMATED      STANDARD
            COEFFICIENT    ERROR          T-VALUE
CONSTANT    6156.43242     2341.57426      2.62919
X1            ¯0.02703        0.01436     ¯1.88241
A1           ¯94.78160      137.79753     ¯0.68783
A2           111.54992      215.75225      0.51703
A3           ¯32.36098       66.86106     ¯0.48400
A4            ¯2.28405        4.61754     ¯0.49465
A5             1.16084        4.03772      0.28750
A6            ¯0.19126        1.07801     ¯0.17742
A7             0.01351        0.08941      0.15107

R-SQUARED.........................    0.16159
RBAR-SQUARED......................    0.01254
F-STATISTIC ( 8, 45)..............    1.08417
STANDARD ERROR OF THE ESTIMATE....    ?.75267
COEFFICIENT OF VARIATION..........    ?.09911
DURBIN-WATSON STATISTIC...........    ?.12624


DISTRIBUTED LAG INTERPRETATION : LAGGED VARIABLE 1

LAG    COEFFICIENT    STD. ERROR    T-VALUE
0      ¯94.78160      118.66762     ¯0.79871
1      ¯15.59266       74.04222     ¯0.21059
2       ¯1.12567       75.07041     ¯0.01480
3      ¯51.38063      113.14059     ¯0.45413


DISTRIBUTED LAG INTERPRETATION : LAGGED VARIABLE 2

LAG    COEFFICIENT    STD. ERROR    T-VALUE
2      ¯0.61936       0.35972       ¯1.72180
3      ¯0.15819       0.30671       ¯0.51577
4       0.16358       0.19969        0.81917
5       0.42699       0.30375        1.40574
6       0.71309       0.33725        2.11440


Notes and Hints 
If any of the variables entered is of an inconsistent length, an error message is printed. 
References 


Almon, S. "The Distributed Lag Between Capital Appropriations and Expenditures", Econometrica,
January 1965.


Cooper, J.P. "Two Approaches to Polynomial Distributed Lags Estimation; An Expository Note and
Comment", The American Statistician, June 1972.


Foot, D.K. and A. North. The Use and Misuse of Econometrics (With Reference to SHARP APL),
Second Edition, University of Toronto and I.P. Sharp Associates Limited, 1977.


Source
A. North
I.P. Sharp Associates Limited
Ottawa, Ontario

SHILLER
Syntax 
SHILLER 
Description 


The function SHILLER performs the analysis of a distributed lag model, having created augmented 
data matrices based upon user-defined smoothness priors. 


Input 
Data is requested in the following order: 
1) The dependent variable. 


2) Independent variables (unlagged). Expected input is a matrix of independent variables upon 
which no lag scheme is to be imposed, each column of the matrix containing a variable. Vector 
input is sufficient if only one such variable is to be included in the regression specification. If 
no such variables are to be included, a response of 0 is expected. 


3) Whether or not a constant term is to be included in the model specification. A response of yes 
or no is expected. 


4) The independent variables upon which distributed lag schemes are to be imposed. These variables 
are to be entered one at a time using a statement of the form 


V SLAG X 


where X is the independent variable upon which is imposed the lag scheme defined in V. V is 
a vector of 5 elements consisting of 


V[1] - degree of smoothness priors (D)

V[2] - number of lags (L)

V[3] - 1 if smoothness priors are to include a zero coefficient for the variable lagged 1 ("head
constraint"), 0 otherwise.

V[4] - 1 if smoothness priors are to include a zero coefficient for the variable lagged L ("tail
constraint"), 0 otherwise.

V[5] - tightness parameter (K).


Various checks are performed as to the validity of the elements of V. Any inadmissibilities are 
rejected and an appropriate error message printed. All variables entered must be of the same 
length. Any number of lag schemes may be imposed on any number of variables. 


Output 

A preliminary ordinary least squares analysis is performed and the resulting R-squared and Durbin- 
Watson statistic printed. The Shiller lag analysis follows with each of the tightness priors K displayed. 
Output includes: 

1) Estimated coefficients, standard errors and t-values for all variables, including the constant term
and unlagged independent variables, where applicable.

2) R-squared

3) Corrected R-squared

4) The overall F-statistic with its degrees of freedom

5) Standard error of the estimate

6) Coefficient of variation (measured in percentage terms at the mean of Y)

7) Durbin-Watson statistic


Example 


Taken from Foot and North, page 202. See References. 


⍉2 48⍴CON,PDI

38776 40704.5046
38664 40767.20943
39492 41534.2592
39940 42161.09729
40460 43437.61858
41352 43520.91437
41876 45049.83207
42736 45767.74898
43316 46754.15941
43176 47014.18374
44180 47546.05015
yuyo 48090.16047
44996 48301.23871
45868 49906.0823
45716 49570.9067
46872 50138.56216
47312 49907.43688
47720 51534.41395
48576 52128.84188
48944 52696.89457
50116 53533.58843
49828 53219.36772
50368 54555.13282
51100 54599.94279
50844 55201.82111
50936 54041.50438
51932 55774.87806
52392 56170.93487
53528 58295.7746
55268 59368.02383
56305 60351.32305
57364 61766.27997
57900 63619.24245
59440 65314.6359
60232 65299.49495
61792 67808.02157
62976 68981.67304
63276 71320.14329
53908 71727.04027
65356 73680.03565
66768 75520.18188
67412 75360.35426
68084 77578.08959
67164 77339.24274
68748 78538.97964
69872 81268.44877
71696 82116.94702
72820 82455.11563
SHILLER
ENTER DEPENDENT VARIABLE
CON
ENTER MATRIX OF INDEPENDENT VARIABLES WHICH ARE NOT TO BE LAGGED (0 IF NONE)
⎕:
0
INCLUDE CONSTANT TERM (YES OR NO)?
YES
ENTER HYPOTHESIZED LAG SCHEME AND INDEPENDENT VARIABLE
⎕:
1 3 0 1 1964.8 SLAG PDI
ANOTHER INDEPENDENT VARIABLE TO BE LAGGED?
NO


PRELIMINARY OLS ANALYSIS:
R-SQUARED.... 0.99404
DURBIN-WATSON 0.52927


SHILLER LAG ANALYSIS -- PRIOR STRENGTH(S) - 1964.800

            COEFFICIENT    STD. ERROR    T-VALUE
CONSTANT    6741.8442      590.9572      11.4083
LAG:  0        0.6036        0.1250       4.8279
      1        0.2535        0.1080       2.3469
      2        0.0180        0.1051       0.1731
      3       ¯0.0754        0.1018      ¯0.7405

R-SQUARED.........................    0.99833
RBAR-SQUARED......................    0.99818
F-STATISTIC ( 4, 44)..............    6565.16281
STANDARD ERROR OF THE ESTIMATE....    752.18054
COEFFICIENT OF VARIATION..........    1.51369
DURBIN-WATSON STATISTIC...........    0.51032

Notes and Hints


1) If the tightness prior K is very large, the estimated coefficients and their standard errors, as well
as such equation statistics as R-squared and the Durbin-Watson, will approximate those of the
polynomial distributed lag analysis performed by the function PDLAG in this workspace (given
the same degree and head and tail constraints). If K is too large, singularity problems may result
in the message DOMAIN ERROR.


2) The degree D of smoothness priors is usually chosen to be lower than is customary with a
polynomial distributed lag analysis. Values of 0 or 1 are most common.


3) The "tightness" parameter K may be any non-negative number. If K=0, the resulting coefficients
are simply the ordinary least squares estimates. A value for K must be chosen by the user. The
originator of the technique, R.J. Shiller, suggests that, for D=1,

K←(SIGMA×L*2)÷8×SUM


should be a reasonable value for K, where SIGMA is the standard error of the regression and
SUM the sum of the distributed lag coefficients, wherever these may be known. L is as above.
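
The notes above can be illustrated by the following Python sketch of a smoothness-prior estimator computed from augmented data matrices. It is an illustration only and not the code of SHILLER: it assumes that the prior is imposed by appending K times a matrix of (D+1)-th differences of the lag coefficients to the regressors, with zeros appended to the dependent variable, and it omits the constant term, unlagged variables, and the head and tail constraints. The names shiller_fit and shiller_k and the use of NumPy are hypothetical.

```python
import numpy as np

def shiller_fit(y, lagged, degree, k):
    """Least squares on data augmented with smoothness-prior rows.

    y      : dependent variable (vector)
    lagged : matrix whose columns are the current and lagged values of X
    degree : degree D of the smoothness prior
    k      : tightness parameter K (0 gives ordinary least squares)
    """
    lagged = np.asarray(lagged, dtype=float)
    n_coef = lagged.shape[1]
    # Rows of r take (degree + 1)-th differences of the lag coefficients.
    r = np.diff(np.eye(n_coef), n=degree + 1, axis=0)
    x_aug = np.vstack([lagged, k * r])
    y_aug = np.concatenate([np.asarray(y, dtype=float), np.zeros(r.shape[0])])
    coef, *_ = np.linalg.lstsq(x_aug, y_aug, rcond=None)
    return coef

def shiller_k(sigma, n_lags, coef_sum):
    """Rule of thumb of note 3 for D=1: (SIGMA x L squared) / (8 x SUM)."""
    return (sigma * n_lags ** 2) / (8 * coef_sum)
```

With K=0 the appended rows vanish and the ordinary least squares estimates are returned. With a very large K the differences are forced towards zero, so the coefficients approach a polynomial of degree D in the lag number.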


References 


Foot, D.K. and A. North. The Use and Misuse of Econometrics (With Reference to SHARP APL),
Second Edition, University of Toronto and I.P. Sharp Associates Limited, 1977.


Shiller, R.J. "A Distributed Lag Estimator Derived From Smoothness Priors", Econometrica, 41,
1973, pp. 775-788.


Source
A. North
I.P. Sharp Associates Limited
Ottawa, Ontario

32 NONLINEAR

FUNCTIONS

Function Header      Documentation     Description
Y GOMPERTZP X        GOMPERTZPHOW      Estimates the parameters K, A and B of the Gompertz equation.
R←Y MARQUARDT X      MARQUARDTHOW      Non-linear regression (non-conversational, explicit result).
Y MARQUARDTP X       MARQUARDTPHOW     Non-linear regression (conversational, detailed formatted output).

GOMPERTZP
Syntax 
Y GOMPERTZP X 
Description 
Given two sets of data points X and Y, GOMPERTZP calculates estimates of the parameters K, A, and
B of the Gompertz equation Y←K×A*B*X. The iterative technique due to Marquardt, with initial
parameter estimates obtained by Kenney's method, is used to minimize the residual sum of squares.


Arguments 
X - A vector of the observations on the independent variable. X must have 4 or more elements. 
Y - A vector of the observations on the dependent variable. Y must have the same number of 


elements as X. 
Output 


The initial parameter estimates and initial residual sum of squares will be printed by GOMPERTZP. 
Subsequently, if solutions for K, A, and B are found, the following will be printed:


1) A table of the estimated values of the parameters, as well as their corresponding standard errors
and t-values.

2) The correlation matrix of the estimated parameters.

3) The variance of the residuals, R-squared, R-bar-squared, and the Durbin-Watson statistic.

4) A table of X, Y, the estimated Y, and the residuals. 


Example 


Taken from page 18 of Richards. See References. The X’s represent the years from 1948-1968, the 
Y's are the initial established reserves of natural gas in Alberta. 


3962 4543 5123 7295 10413 13115 15202 17307 19598 
21962 25822 28963 33220 33691 35456 36727 
39776 42960 44403 47025 51803 


Y GOMPERTZP X
Y[K,A,B;X] = ∊ + K×A*B*X
INITIAL PARAMETER ESTIMATES

K= 63128.120010
A= 0.052297
B= 0.887798

INITIAL SUM OF SQUARES: 27846694.23
SOLUTIONS ARE FOUND AFTER 4 ITERATIONS.

APPROXIMATE STATISTICS FROM LINEAR REGRESSION THEORY

     EST. VALUE      STD. ERR.      T-VALUE
K    69555.077389    4535.501459     15.335697
A        0.051638       0.004355     11.857144
B        0.897881       0.007662    117.182571

CORRELATION MATRIX OF THE ESTIMATED PARAMETERS
1.000000 0.263960 0.961994
0.263960 1.000000 0.492184
0.961994 0.492184 1.000000

VARIANCE OF RESIDUALS ....... 1302528.662641
R SQUARED ................... 0.994943
R-BAR SQUARED ............... 0.994381
DURBIN-WATSON STATISTIC ..... 1.056564

TABLE OF RESIDUALS

X            OBSERVED Y      ESTIMATED Y     RESIDUAL
 0.000000     3962.000000     3591.664767      370.335233
 1.000000     4543.000000     4861.010815     ¯318.010815
 2.000000     5123.000000     6378.752801    ¯1255.752801
 3.000000     7295.000000     8141.303637     ¯846.303637
 4.000000    10413.000000    10135.187034      277.812966
 5.000000    13115.000000    12338.267577      776.732423
 6.000000    15202.000000    14721.543005      480.456995
 7.000000    17307.000000    17251.228333       55.771667
 8.000000    19598.000000    19890.884840     ¯292.884840
 9.000000    21962.000000    22603.398172     ¯641.398172
10.000000    25822.000000    25352.672544      469.327456
11.000000    28963.000000    28104.969293      858.030707
12.000000    33220.000000    30829.869898     2390.130102
13.000000    33691.000000    33500.882168      190.117832
14.000000    35456.000000    36095.733281     ¯639.733281
15.000000    36727.000000    38596.406379    ¯1869.506379
16.000000    39776.000000    40988.981019    ¯1212.981019
17.000000    42960.000000    43263.331969     ¯303.334969
18.000000    44403.000000    45412.757939    ¯1009.757939
19.000000    47025.000000    47433.519158     ¯408.512158
20.000000    51803.000000    49324.421478     2478.578522

Notes and Hints
1) The iteration within GOMPERTZP may be terminated in one of the following 3 ways:
a) The relative change in each parameter estimate is less than .00001.
b) The relative change in the residual sum of squares is less than .00001.
c) Failure to converge after a preset maximum number of iterations. The maximum number of
iterations performed by GOMPERTZP is governed by the global variable LIMIT, with a default
value of 20.


2) GOMPERTZP is designed to handle data which does in fact roughly follow the Gompertz model.
It may behave badly if the data do not follow such a model.
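
Because APL evaluates from right to left, the Gompertz equation K×A*B*X means K times A raised to the power (B raised to the power X). The following Python sketch evaluates the curve with that reading and expresses the relative-change test of note 1. It is an illustration only; the names gompertz and relative_change_small are hypothetical, and the parameter values are the final estimates printed in the example.

```python
def gompertz(x, k, a, b):
    """Evaluate Y = K * A ** (B ** X), the right-to-left reading of K×A*B*X."""
    return k * a ** (b ** x)

def relative_change_small(old, new, tol=0.00001):
    """True when every value in `new` differs from the matching value in
    `old` by less than tol in relative terms (old values assumed non-zero)."""
    return all(abs(n - o) < tol * abs(o) for o, n in zip(old, new))

# Final estimates of K, A and B from the example output.
K, A, B = 69555.077389, 0.051638, 0.897881
for x in (0, 1, 20):
    print(x, gompertz(x, K, A, B))
```

Evaluated at X=0, 1 and 20, the function agrees with the ESTIMATED Y column of the table of residuals to within the rounding of the printed parameters.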


References 


Foot, D.K. and A. North. The Use and Misuse of Econometrics (With Reference to SHARP APL),
Second Edition, University of Toronto and I.P. Sharp Associates Limited, 1977.


Kenney, J.F. Mathematics of Statistics, D. Van Nostrand Company, New York, 1947.


Marquardt, D.W. "An Algorithm for Least Squares Estimation of Non-Linear Parameters", Journal
of the Society for Industrial and Applied Mathematics, Volume 11, No. 2, 1963, pp. 431-441.


Richards, R. A Comparison of Methods of Evaluation of the Gompertz Curve, The National 
Energy Board of Canada, 1972. 


Source
R. Hui
I.P. Sharp Associates Limited
Calgary, Alberta

MARQUARDT
Syntax
B←Y MARQUARDT X
Description
MARQUARDT provides, by an iterative technique, least squares estimates of parameters entering
non-linearly into a mathematical model. The user specifies the model by defining a function called AFN
which computes the actual function values.
Definition of the Subfunction AFN
See the discussion under the same heading in the documentation for MARQUARDTP in this workspace.
Arguments 


Y - A vector of the N observations on the dependent variable. 
X - A matrix of the N sets of observations on the M independent variables. X should be an 
M-column matrix. 


In addition, the global variable INITIAL, a vector of the initial estimates of the P parameters, and 
LIMIT, the maximum number of iterations, need to be set before MARQUARDT can be run. 


Result


B - A vector of the P final values of the parameters.


Example 
Taken from Keeping, page 354. See References. 


INITIAL←.725 4

E
2.138 3.421 3.597 4.34 4.882 5.66
T
1.309 1.471 1.49 1.565 1.611 1.68
∇AFN[⎕]∇
∇ R←A AFN X
[1] R←A[1]×X*A[2]
∇

E MARQUARDT T
0.7689146977 3.860256623

Notes and Hints
1) The iteration is stopped by one of three methods:

a) If the relative change in each parameter is less than 0.00001.
b) If the relative change in the sum of squares is less than 0.00001.
c) If the specified maximum number of iterations has been executed.


2) It is very important to have reasonable initial estimates for the parameters. With good initial
estimates, aside from computation time being decreased, MARQUARDT would converge to more
reasonable final values for the parameters (in general non-linear least squares estimates are not
unique).

3) MARQUARDT does not check for invalid data, supply information on each iteration, or provide the
final statistical output which indicates the accuracy and degree of usefulness of the final parameter
values and the appropriateness of the hypothesized model. Therefore, some effort should be
expended to validate the statistical soundness of the final parameter values before they are used.
To obtain the full benefit of Marquardt's algorithm, the function MARQUARDTP should be used.


Methodology 


An iterative technique is used; the estimates at each iteration are obtained by a method due to 
Marquardt which combines the Gauss linearization method and the method of steepest descent. 
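
The following Python sketch shows the general shape of such an iteration for the model of the example, Y = A[1]×X*A[2]. It is a simplified illustration and not the code of MARQUARDT: the starting value of lambda, the rule of multiplying or dividing lambda by 10, the forward-difference derivatives with a proportion of 0.01, and the names afn and marquardt are assumptions of the sketch. The data are those of the example, with the second X value taken as 1.471, the value implied by the estimated Y values printed in the MARQUARDTP example.

```python
import numpy as np

def afn(a, x):
    """Model function (the role of the user-defined AFN): Y = A[1] * X ** A[2]."""
    return a[0] * x ** a[1]

def marquardt(y, x, initial, limit=20, proportion=0.01, tol=0.00001):
    a = np.asarray(initial, dtype=float)
    sse = float(np.sum((y - afn(a, x)) ** 2))
    lam = 0.01                                   # assumed starting lambda
    for _ in range(limit):
        f0 = afn(a, x)
        jac = np.empty((len(y), len(a)))
        for j in range(len(a)):                  # forward-difference derivatives
            step = proportion * a[j]             # assumes a[j] is non-zero
            shifted = a.copy()
            shifted[j] += step
            jac[:, j] = (afn(shifted, x) - f0) / step
        h = jac.T @ jac
        g = jac.T @ (y - f0)
        while True:
            # Small lambda: close to Gauss linearization.
            # Large lambda: short step along the (scaled) steepest descent direction.
            delta = np.linalg.solve(h + lam * np.diag(np.diag(h)), g)
            trial = a + delta
            trial_sse = float(np.sum((y - afn(trial, x)) ** 2))
            if trial_sse <= sse or lam > 1e10:
                break
            lam *= 10.0
        if trial_sse > sse:
            break                                # no improving step was found
        small_step = bool(np.all(np.abs(delta) <= tol * np.abs(a)))
        small_gain = (sse - trial_sse) <= tol * sse
        a, sse, lam = trial, trial_sse, lam / 10.0
        if small_step or small_gain:
            break
    return a

E = np.array([2.138, 3.421, 3.597, 4.34, 4.882, 5.66])
T = np.array([1.309, 1.471, 1.49, 1.565, 1.611, 1.68])
print(marquardt(E, T, [0.725, 4.0]))
```

Started from the initial estimates .725 and 4, the sketch should arrive at values close to the result shown in the example, although its intermediate steps need not match those of the library function.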


References 
Booth, G.W., G.E.P. Box, M.E. Muller and T.I. Peterson. Forecasting by Generalized Regression
Methods, Non-Linear Estimation, (Princeton - IBM), International Business Machines Corporation,
New York, 1959.


Foot, D.K. and A. North. The Use and Misuse of Econometrics (With Reference to SHARP APL),
Second Edition, University of Toronto and I.P. Sharp Associates Limited, 1977.


Keeping, E.S. Introduction to Statistical Inference, D. Van Nostrand Company, Inc., Princeton,
1962.


Marquardt, D.W. "An Algorithm for Least Squares Estimation of Non-Linear Parameters", Journal
of the Society for Industrial and Applied Mathematics, Volume 11, No. 2, 1963, pp. 431-441.


Source
A. North
I.P. Sharp Associates Limited
Ottawa, Ontario

(Modified by R. Hui, I.P. Sharp Associates Limited)

MARQUARDTP
Syntax 
Y MARQUARDTP X 
Description 
The function MARQUARDTP provides, by an iterative technique, least squares estimates of parameters
entering non-linearly into a mathematical model. The user specifies the model by defining a function
called AFN which computes the actual function values.


Definition of the Subfunction AFN


Prior to executing MARQUARDTP, the user must specify the model by writing the subfunction AFN. For
example, to define the model Y equals A times e (2.71828...) to the power B times X, the user would
type


∇ R←A AFN X
[1] R←A[1]×(*(A[2]×X))
[2] ∇


In this case, A[1] and A[2] are the parameters to be estimated, with A[1] equivalent to A and
A[2] equivalent to B (any further parameters would have to be identified as A[3], A[4], ... ,
A[K]), and X represents the independent variable which will be provided as an argument.


Arguments 


Y - The dependent variable (a vector or a 1-column matrix). 
X - The independent variable(s). If there is only 1 independent variable, X may be either a vector 
or a 1-column matrix; if there are K independent variables, X should be a K-column matrix. 


State Setting Functions 


Before MARQUARDTP can be executed, certain subsidiary arguments need to be specified. These
subsidiary arguments are defined by a group of state setting functions. The functions, with the default
values (if any) in square brackets, are


STATE 
Displays the current state. 


DEFAULT 
Sets the state to the default values. 


INITIAL ESTIMATES X 
Sets the initial estimates of each parameter. 


LOWER LIMITS X [LOWER LIMITS ¯1E75]
Sets the lower limits on values of the parameters. Execution of MARQUARDTP terminates if during
an iteration a parameter estimate becomes less than its lower limit. There may be as many lower
limits as there are parameters in the model.


UPPER LIMITS X [UPPER LIMITS 1E75]
Sets the upper limits on values of the parameters. Execution of MARQUARDTP terminates if during
an iteration a parameter estimate becomes greater than its upper limit. There may be as many
upper limits as there are parameters in the model.


MINIMUM RSQUARED X [MINIMUM RSQUARED 1]
The iteration stops if the calculated R-squared value becomes greater than X. Note that this is
merely one of the four stopping criteria.


PROPORTIONS X [PROPORTIONS 0.01]
Sets the proportions to be used in calculating the difference quotients which approximate the
partial derivatives. These values are multiplied by the parameter estimates to give the denominators
of the difference quotients, and they determine "step size" for the method of steepest descent.
(A value of .01 for each parameter seems to work reasonably well.)


MAXIMUM ITERATIONS X [MAXIMUM ITERATIONS 20]
Sets the maximum number of iterations to be executed.


PRINT EVERY X [PRINT EVERY 1]
Sets the iterations whose results are to be printed. For example, type PRINT EVERY 5 if every
fifth iteration is to be printed. The results of the final iteration are printed regardless of whether
it is an iteration whose results would normally be printed.


Output 


The sum of squares and the eigenvalues of the moment matrix, calculated using the initial parameter 
estimates, are printed. 


For the iterations whose results are printed, the following is provided: 


1) The determinant of the matrix which must be inverted in order to calculate the correction vector. 
This determinant should decrease during successive iterations. A sudden large drop in the value 
of the determinant is due to ill-conditioning, and hints that the results may be inaccurate. 

2) Angle in degrees of scaled units between the correction vector of Gauss linearization and the 
correction vector of steepest descent. 

3) The new parameter estimates. 

4) The new sum of squares.

5) The new computed value of R-squared, if MINIMUM RSQUARED N for N less than 1 has been set.

6) The value of lambda. (Used to interpolate between the correction vectors of Gauss linearization
and of steepest descent.)


After convergence is achieved, the following statistics are printed: 


1) A table giving, for each parameter, the final estimated value, the standard error, and the computed 
t-value. 

2) Тһе correlation matrix of the estimated parameters. 

3) Тһе variance of the residuals, R-squared, R-bar-squared, and the Durbin-Watson statistic. 

4) A table of residuals. 
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Example 
Taken from Keeping, page 354. See References. 


Xx 
1.309 1.571 1.49 1.565 1.611 1.68 


Y 
2.138 3.421 3.597 4.34% 4.882 5.66 


ТАЕМІП19 
V Вед AFN X 
[1]  ReA[1]xX«4[2] 
V 


INITIAL ESTIMATES .725 u 
MAXIMUM ITERATIONS 10 


STATE 

INITIAL ESTIMATES 0.725 4 
LOWER LIMITS 71Е75 
UPPER LIMITS 1E75 
MINIMUM RSQUARED 1 
PROPORTIONS 0.01 
MAXIMUM ITERATIONS 10 
PRINT EVERY 1 


Y MARQUARDTP X 
PRELIMINARY ANALYSIS 
INITIAL SUM OF SQUARES  1.4721303E 2 
EIGENVALUES OF MOMENT MATRIX 
2.2196981Е2 
3.7162640Е 1 


ITERATION NUMBER 1 


DETERMINANT 2.0207138E 2 


ANGLE IN SCALED COORDINATES 81.30 DEGREES. 


NEW PARAMETER ESTIMATES 


0.763282 

3.874577 
NEW SUM OF SQUARES ц.4702030Е 3 
LAMBDA 1.000008 3 


ITERATION NUMBER 2 


DETERMINANT 1.8790762E 2 


ANGLE IN SCALED COORDINATES 81.94 DEGREES. 


NEW PARAMETER ESTIMATES 
0.768848 
3.860516 


NEW SUM OF SQUARES 4.31741628 3 
LAMBDA 1.00000Е 4 


ITERATION NUMBER 3 
DETERMINANT 1.8654454Е 2 
ANGLE IN SCALED COORDINATES 79.34 DEGREES. 
NEW PARAMETER ESTIMATES 


0.768915 
3.860257 

NEW SUM OF SQUARES 1.3173175E 3 

LAMBDA 1.000002 5 


ITERATION NUMBER 4 
DETERMINANT 3.0186353E0 
ANGLE IN SCALED COORDINATES 0.57 DEGREES. 
МЕМ PARAMETER ESTIMATES 


0.768915 
3.860257 
NEW SUM OF SQUARES 4.3173178Е73 
LAMBDA 1.00000Е0 
ІТЕРАТІОМ STOPS.  RELATIVE CHANGE IN EACH PARAMETER LESS 
THAN 0.00001. 


APPROXIMATE STATISTICS FROM LINEAR THEORY

EST. PAR.   STD. ERR.    T-VALUE
 0.768915    0.018153   42.357841
 3.860257    0.050894   75.849003

CORRELATION MATRIX OF THE ESTIMATED PARAMETERS
1.000000 0.990650
0.990640 1.000000

VARIANCE OF RESIDUALS ....... 0.001079
R SQUARED ................... 0.999433
R-BAR SQUARED ............... 0.999291
DURBIN-WATSON STATISTIC ..... 1.948463

OBSERVED Y   ESTIMATED Y   RESIDUALS
  2.138000      2.174179   ¯0.036179
  3.421000      3.411191    0.009809
  3.597000      3.584442    0.012558
  4.340000      4.332648    0.007352
  4.882000      4.845294    0.036706
  5.660000      5.696785   ¯0.036785
Notes and Hints
1) Iteration stops when any one of four conditions is met:

a) The relative change in each parameter is less than 0.00001.

b) The relative change in the sum of squares is less than 0.00001.

c) The calculated R-squared value exceeds the specified minimum.

d) There has been no convergence after the given maximum number of iterations. In this case,
you may wish to start again using the final estimated parameter values as initial parameter
estimates. (These final parameter values are available upon completion of the execution of
MARQUARDTP in a variable called B.)


2) It is very important to have reasonable initial estimates for the parameters. Good initial
estimates save computation time and help MARQUARDTP converge to more reasonable final values
for the parameters (in general, non-linear least squares estimates are not unique).


3) The variable M is available at the end of execution. This is an N×2 matrix, where N is the number
of observations, whose respective columns contain the observed and predicted Y-values. M may
be useful for subsequent plotting.


4) For a model having K independent variables, within the model function AFN the i-th independent
variable should be referred to as the i-th column of X. For example, with a 4 independent variable
model, the independent variables would be X[;1], X[;2], X[;3], and X[;4].


5) The ratio of the largest to smallest of the eigenvalues of the moment matrix is an indication of
the conditioning of the matrix: the larger the ratio, the worse the conditioning. Negative
eigenvalues may occur if the conditioning is extremely bad, possibly resulting from more parameters
than necessary to explain the data. Removal of parameters may remedy this problem.


6) The residuals should be examined for patterns in sign or magnitude. MARQUARDTP assumes that
the errors are independent random variables from a normal distribution with an expected value 
of zero. 

7) If MARQUARDTP terminates due to one or more of the parameters straying beyond the specified 
limits, the final parameter estimates are still available in the variable B. No statistics will be 
printed, however. 


8) For models following Gompertz’ equation, the function GOMPERTZP should be used as it will 
provide better results. 


Methodology 


An iterative technique is used; the estimates at each iteration are obtained by a method due to
Marquardt which combines the Gauss linearization method and the method of steepest descent. 
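
The damping idea described above can be sketched in Python. This is a minimal illustration and not the code of MARQUARDTP: the function names, the starting value of the damping factor, and the rule of dividing or multiplying it by 10 are assumptions. The model and the starting estimates are those of the example, and the second X value is taken as 1.471, the value consistent with the printed estimated Y. A small damping factor gives a step close to the Gauss linearization step; a large one gives a short step close to the direction of steepest descent.

```python
import math

def model(a, xs):
    """Model of the example: y = a[0] * x ** a[1]."""
    return [a[0] * x ** a[1] for x in xs]

def sum_sq(a, xs, ys):
    """Residual sum of squares for parameter vector a."""
    return sum((y - f) ** 2 for y, f in zip(ys, model(a, xs)))

def marquardt(xs, ys, a, lam=0.01, max_iter=10, tol=1e-5):
    """Two-parameter damped least squares (hypothetical helper)."""
    ss = sum_sq(a, xs, ys)
    for _ in range(max_iter):
        r = [y - f for y, f in zip(ys, model(a, xs))]
        # Partial derivatives of the model with respect to a[0] and a[1].
        j1 = [x ** a[1] for x in xs]
        j2 = [a[0] * x ** a[1] * math.log(x) for x in xs]
        g11 = sum(u * u for u in j1)
        g12 = sum(u * v for u, v in zip(j1, j2))
        g22 = sum(v * v for v in j2)
        b1 = sum(u * e for u, e in zip(j1, r))
        b2 = sum(v * e for v, e in zip(j2, r))
        while True:
            # Inflate the diagonal by (1 + lam) and solve the 2x2 system.
            m11, m22 = g11 * (1 + lam), g22 * (1 + lam)
            det = m11 * m22 - g12 * g12
            d = [(b1 * m22 - g12 * b2) / det, (m11 * b2 - g12 * b1) / det]
            trial = [a[0] + d[0], a[1] + d[1]]
            ss_new = sum_sq(trial, xs, ys)
            if ss_new <= ss:
                lam /= 10          # step accepted: lean towards Gauss
                break
            lam *= 10              # step rejected: lean towards descent
            if lam > 1e10:
                return a, ss
        small = all(abs(di) <= tol * abs(pi) for di, pi in zip(d, trial))
        a, ss = trial, ss_new
        if small:                  # relative change in each parameter is tiny
            break
    return a, ss

X = [1.309, 1.471, 1.49, 1.565, 1.611, 1.68]
Y = [2.138, 3.421, 3.597, 4.340, 4.882, 5.66]
A_HAT, SS = marquardt(X, Y, [0.725, 4.0], max_iter=50)
print(A_HAT, SS)
```

The stopping test used here corresponds to the first of the four conditions listed in the notes; the other three are omitted to keep the sketch short.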
Source
A. North
I.P. Sharp Associates Limited
Ottawa, Ontario.


(Modified by B. Hui, I.P. Sharp Associates Limited)


32 PROBIT 


FUNCTIONS 
Function
Header      Documentation    Description
PROBIT      PROBITHOW        Performs a probit analysis on quantal data in
                             biological assay.
PROBIT
Syntax
PROBIT
Description
This program performs a probit analysis on quantal data in biological assay.
Input
The following information is requested conversationally:

1) doses, which must be strictly positive
2) number of subjects at each dose level
3) number of responses at each dose level
4) percentage of response at 0 dose level. If this is not available enter 0.

An option is provided to request a plot.
Output 


A table is printed giving: 


1) dose 

2) log dose 

3) no. of hosts 
4) response 


5) adjusted percent response 
6) empirical probit 

7) expected probit 

8) working probit 

9) weighting factor 

10) NWX 

11) NWY 


Also provided are: 

1) the means 

2) the intercept and slope of the probit line, along with the variance and standard error of the slope 
3) the ED50 along with its variance, standard error and confidence limits 

4) chi-square statistics 


Example 


Taken from Finney, page 52. See Reference. 


      PROBIT
ENTER DOSES.
⎕:
      2.6 3.8 5.1 7.7 10.2
ENTER NUMBER OF HOSTS AT EACH DOSE.
⎕:
      50 48 46 49 50
ENTER NUMBER OF RESPONSES AT EACH DOSE.
⎕:
      6 16 24 42 44
ENTER PERCENTAGE OF RESPONSES AT 0 DOSE. (ENTER 0 IF NO SUCH DATA AVAILABLE.)
⎕:
      0
DO YOU WISH A PLOT?
YES
ALIGN PAGE
         LOG     NO.           ADJ     EMP     EXP    WKING
 DOSE    DOSE    HOST   RESP   %RESP   PROB    PROB   PROB    WEIGHT   NWX   NWY
 2.60    0.415    50      6    12.00   3.825   .196
 3.80    0.580    48     16    33.33   4.570   .871
 5.10    0.708    46     24    52.17   5.054   .596
 7.70    0.885    49     42    85.71   6.058   .602
10.20    1.009    50     44    88.00   6.175   507
                                                      117.370   82.793   596.772
MEAN OF X = 0.7053975977   MEAN OF Y = 5.084524588
INTERCEPT = 2.112369971
SLOPE OF PROBIT LINE = 4.213455903
VARIANCE = 0.228952322   STANDARD ERROR = 0.4784906108
ED50 VALUE = 4.845481258
VARIANCE = 0.003333627786   STANDARD ERROR = 0.2857377441

LIMITS OF ED50 ARE: 4.599743514 ≤ ED50 ≤ 5.091219002

*** ADJUSTED PROBIT VALUES ***

3.861 4.555 5.094 5.848 6.362
CHI-SQUARE = 1.708492521   D.F. = 3
CHI-SQUARE(GOLDSTEIN) = 128.2366554   D.F. = 3
[Character plot: PROBITS (vertical axis, 3.82 to 7.42) against LOG OF DOSES (horizontal axis)]
o - EMPIRICAL PROBITS     + - ADJUSTED PROBITS
DO YOU WISH TO ITERATE ONCE MORE?
NO
Notes and Hints


1) Due to computational considerations, a 100% response results in an empirical probit value of
9; a 0% response results in an empirical probit value of 1.


2) The "adjusted percentage response" is the percentage response adjusted by Abbott's formula.

3) Note that if the dose values are less than 1 they can easily be transformed to values greater than
1 by means of multiplication by an appropriate factor. This has the same effect as if the doses
had been measured in a smaller unit.
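
The quantities named in these notes can be sketched in Python. This is an illustration only: the function names are hypothetical, the usual form of Abbott's formula is assumed, and the ED50 formula assumes base 10 log doses, which agrees with the LOG DOSE column of the example.

```python
from statistics import NormalDist

def abbott(pct, control_pct):
    """Abbott's formula: percentage response corrected for the response at 0 dose."""
    return 100.0 * (pct - control_pct) / (100.0 - control_pct)

def empirical_probit(pct):
    """Probit = 5 + standard normal deviate of the response proportion."""
    if pct <= 0.0:
        return 1.0      # convention stated in note 1
    if pct >= 100.0:
        return 9.0      # convention stated in note 1
    return 5.0 + NormalDist().inv_cdf(pct / 100.0)

def ed50(intercept, slope):
    """Dose at which the fitted probit line reaches 5 (base 10 log doses)."""
    return 10.0 ** ((5.0 - intercept) / slope)

# First dose level of the example: 6 responses out of 50 hosts, 0% at 0 dose.
print(round(empirical_probit(abbott(100.0 * 6 / 50, 0.0)), 3))
print(ed50(2.112369971, 4.213455903))
```

With the intercept and slope printed in the example, the last line gives a value close to the reported ED50.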

Reference 

Finney, D.J. Probit Analysis, Cambridge Press, Cambridge, 1971. 

Source 

J.C. Douglas 

Department of Microbiology 

University of Guelph 

Guelph, Ontario 


(Modified by L. Gibson, I.P. Sharp Associates Limited)
32 REGRESSION
FUNCTIONS


Function
Header                Documentation         Description


Y COCHRANEΔORCUTT X   COCHRANEΔORCUTTHOW    Performs autocorrelation adjustment
                                            according to the Cochrane-Orcutt
                                            procedure.

CONVERT N             CONVERTHOW            To be used immediately following
                                            regression analysis to convert back from
                                            logged units to original units of the
                                            dependent variable.

Y GLS X               GLSHOW                Generalized least squares analysis.

Y HILDRETHΔLU X       HILDRETHΔLUHOW        Uses the method of Hildreth-Lu for
                                            regression analysis where there are
                                            problems with autocorrelation.

Y INSTRUMENTAL X      INSTRUMENTALHOW       Uses instrumental variables as a method
                                            of estimating parameters in an
                                            equation.

PREDICT               PREDICTHOW            Provides point estimates and confidence
                                            intervals for a dependent variable given
                                            values for the independent variables.

Y REGR X              REGRHOW               Simple or multiple regression analysis
                                            using ordinary least squares.

Y STAGE3 X            STAGE3HOW             Two- and three-stage least squares
                                            analyses of a set of simultaneous linear
                                            equations.
COCHRANEΔORCUTT
Syntax
Y COCHRANEΔORCUTT X
Description
The function COCHRANEΔORCUTT uses the Cochrane-Orcutt procedure to perform a regression analysis
when it is known or suspected that the residuals are autocorrelated. The final regression calculations
are performed by the function REGR, so that additional output options may be obtained by means of
state setting functions (see documentation of REGR in this workspace).
Arguments
Y - A vector of length N representing the dependent variable.
X - A matrix with N rows whose columns contain the independent variables.
Input 


The user is asked to specify the maximum number of iterations after which iteration is to cease, 
regardless of whether convergence has been attained. A limit of 10 or 20 will normally suffice. 


Output 


Full regression output as provided by the function REGR is printed, as well as initial and final estimates
of ρ, the coefficient of simple correlation between the residuals.


Example 
Taken from Foot and North, page 132. See References. 


      ⍉2 48⍴CON,PDI
38776 40704.5046
38565 40767.20993
39492 41534.2592
39940 42161.09729
40460 43437.61858
41352 43520.91437
41876 45049.83207
42736 45767.74898
43316 46754.15941
43176 47014.18374
44180 47546.05015
цащо 48090.16047
44996 48301.23871
45868 49906.0823
45716 49570.9067
46872 50138.56216
47312 49907.43688
47720 51634.41395
48576 52128.84188
agguy 52696.89557
50116 53533.58843
49828 53219.36772
50368 54555.13282
51100 54599.96279
50848 55201.82111
50936 5%4041.50%38
51932 55774.87806
52392 56170.93487
53528 58295.7746
55268 59368.02383
56304 60351.32305
57364 61766.27997
57900 63619.24245
59440 65314.6359
60232 65299.49495
61792 67808.02157
62976 68981.67304
63276 71320.14329
63908 71727.04027
65356 73680.03464
66768 75520.18188
67412 75360.35426
68084 77578.08959
67164 77339.24274
68748 78538.97964
69872 81268.00877
71695 82116.94702
72820 82455.11563


      CON COCHRANEΔORCUTT PDI
MAXIMUM NUMBER OF ITERATIONS 10

** CONVERGENCE OBTAINED AFTER 2 ITERATIONS **
INITIAL ESTIMATE OF ρ 0.6465951281
FINAL ESTIMATE OF ρ   0.6681208293
ALIGN PAPER

CORRELATION MATRIX (WITH T-VALUES)

 1.00000   0.98623
39.99794   1.00000

MEAN OF DEPENDENT VARIABLE 18229.31435

VARIABLE            MEAN    ESTIMATED COEFFICIENT   STD. ERROR    T-VALUE
CONSTANT TERM                     2442.38359         403.65560    6.05066
1            20016.43674             0.78870           0.01972   39.99794

SOURCE OF VARIATION   DF      SUM OF SQUARES       MEAN SQUARE   F-STATISTIC
MEAN                   1  15618471374.6561008
REGRESSOR:X 1          1    537987377.59205    537987377.59205   1599.83533
RESIDUAL              45     15132452.43105       336276.72069
TOTAL                 47  16171591204.63339

MULTIPLE CORRELATION COEFFICIENT (R*2)...............    0.9726416382
CORRECTED R*2 (R̄*2)..................................    0.9720336746
F-STATISTIC FOR SIGNIFICANCE OF REGRESSION( 1, 45)...  1599.8353275264
STANDARD ERROR OF THE ESTIMATE.......................   579.8937149982
DURBIN-WATSON STATISTIC..............................     2.3570908608
COEFFICIENT OF VARIATION (AT THE MEAN OF Y)..........     3.1811054653

** NOTE THAT CONSTANT TERM IS α(1-ρ) **
ORIGINAL CONSTANT TERM IS 7359.255418
STANDARD ERROR 1216.272786


Notes and Hints

Notes and Hints 


1) The estimate of the intercept is equal to α(1-ρ); thus an estimate of α can be obtained by dividing
this intercept by 1-ρ, where ρ is the final estimate of ρ provided in the output. The original
intercept α and its standard error are printed as additional outputs.


2) Note that there is no guarantee that the final estimate of p will be the "optimal" estimate, in 
the sense of minimizing the residual sum of squares. This difficulty arises because the iterative 
technique may lead to a local rather than a global minimum. 


3) Note also that all equation statistics (such as R-squared) printed in the output reflect variation 
explained in the transformed dependent variable, rather than in the original Y-values themselves. 


References 
Cochrane, D. and G.H. Orcutt. "Application of Least Squares Regression to Relationships Containing
Autocorrelated Error Terms", Journal of the American Statistical Association, Vol. 44, 1949, pp.
32-61.


Foot, D.K. and A. North. The Use and Misuse of Econometrics (With Reference to SHARP 
APL), Second Edition, University of Toronto and I.P. Sharp Associates Limited, 1977. 


Source 
A. North 


I.P. Sharp Associates Limited
Ottawa, Ontario 
CONVERT
Syntax
CONVERT N
Description


The function CONVERT, when executed immediately following an execution of the function REGR,
reproduces in terms of the original units of the dependent variable all equation statistics which were
calculated by REGR in terms of logged units of the dependent variable. Thus, the function may be
executed whenever (log Y) has been included in a specification as the dependent variable.


Arguments


N - The base to which logs were taken when defining the specification to be estimated. Common
values of N will be 10 for base 10 logarithms and *1 for natural logarithms.


Output 


Seven equation statistics which either are measured in the units of the dependent variable or depend 
on the dependent variable for their calculation are printed. These include: 


1) the mean of the dependent variable. 

2) the coefficient of multiple determination (R-squared). 

3) R-squared adjusted for degrees of freedom. 

4) the F-statistic for testing the significance of the overall regression (accompanied by the appropriate 
degrees of freedom). 

5) the standard error of the estimate. 

6) the coefficient of variation (measured in percentage terms at the mean of the dependent variable). 

7) the Durbin-Watson statistic. 


Example 
The accompanying example was executed following an analysis using (ln AF) as the dependent
variable and (ln POP) and (ln GNP) as independent variables. The function REGR was used; the
data is simply the natural logarithm of the data presented in the REGR example.


      CONVERT *1
MEAN OF DEPENDENT VARIABLE.... 331.91525
R-SQUARED.....................   0.60558
RBAR-SQUARED..................   0.59045
F-STATISTIC( 2, 56)...........  42.81012
STANDARD ERROR OF THE ESTIMATE 412.11501
COEFFICIENT OF VARIATION...... 124.16272
DURBIN-WATSON STATISTIC.......   1.81747


Restrictions
Since a number of global variables which are generated by the function REGR are required to perform
the necessary calculations, the function CONVERT must be executed only immediately following the
execution of REGR.
Methodology

The calculated Y-values and the estimated residuals, available as global variables in the N×1 matrices
YHAT and AE respectively, are added together, resulting in the original observations on the dependent
variable. The antilogs of these values are then used as the basis for the calculation of the required
equation statistics.
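
The conversion step can be sketched in Python. This is an illustration under stated assumptions, not the code of CONVERT: the function name is hypothetical, and it is assumed that the fitted values in original units are the antilogs of the calculated Y-values. Only the mean and R-squared are shown.

```python
def convert_r_squared(yhat_log, resid_log, base):
    """Mean and R-squared in original units after a fit to log_base(Y)."""
    # Calculated value plus residual gives the logged observation; take antilogs.
    y = [base ** (f + e) for f, e in zip(yhat_log, resid_log)]
    fit = [base ** f for f in yhat_log]        # assumed fitted values in original units
    mean = sum(y) / len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, fit))
    sst = sum((a - mean) ** 2 for a in y)
    return mean, 1.0 - sse / sst

# Base 10 example with a perfect fit: observations 1, 10 and 100.
print(convert_r_squared([0.0, 1.0, 2.0], [0.0, 0.0, 0.0], 10.0))
```

For natural logarithms the base would be e, which is the value produced by the APL expression *1 used in the example.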


References 


Draper, N.R. and Н. Smith. Applied Regression Analysis, John Wiley & Sons Inc., New York, 
1966. 


Foot, D.K. and A. North. The Use and Misuse of Econometrics (With Reference to SHARP APL),
Second Edition, University of Toronto and I.P. Sharp Associates Limited, 1977.


Source
A. North
I.P. Sharp Associates Limited
Ottawa, Ontario 
GLS

Syntax

Y GLS X

Description

The function GLS performs a generalized least squares analysis. The technique provides best
(minimum variance) parameter estimates, which an ordinary least squares analysis (using REGR) might
be unable to do because of, for example, heteroscedasticity or serial correlation. Optional
outputs and the exclusion of the constant term are controlled by state setting functions (see the
documentation of REGR).

Arguments

Y - A vector of N observations representing the dependent variable.

X - A matrix with N rows whose columns contain the independent variables. A column of ones
need not be inserted to represent the constant term; this is controlled by the (default) state
setting CONSTANT.

Input

The user is asked to specify whether or not the scalar covariance matrix of the disturbance term is
known. Normally this will be unknown and calculated by the function, the calculation being dependent
upon a parameter ρ (where -1<ρ<1) which measures the extent to which successive disturbances are
correlated with one another. An estimate of ρ is requested; if it is unknown, the user should enter
10, and ρ will be estimated by the function. In the relatively unusual event that the covariance matrix
is known, its input is requested.
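
How a covariance matrix can depend on the single parameter ρ is sketched below in Python. The manual does not spell out the form used by GLS; the sketch assumes the usual first-order autoregressive pattern, in which the correlation between disturbances i and j is ρ raised to the power |i-j|. All function names are hypothetical.

```python
import math

def ar1_covariance(n, rho):
    """Correlation pattern of a first-order autoregressive disturbance (assumed form)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must be less than one in absolute value")
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def whitening_matrix(n, rho):
    """Transformation that turns AR(1) disturbances into uncorrelated ones."""
    p = [[0.0] * n for _ in range(n)]
    p[0][0] = math.sqrt(1.0 - rho * rho)       # first observation is rescaled
    for t in range(1, n):
        p[t][t - 1], p[t][t] = -rho, 1.0       # the rest are quasi-differenced
    return p

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

rho = 0.65
omega = ar1_covariance(5, rho)
p = whitening_matrix(5, rho)
check = matmul(matmul(p, omega), transpose(p))  # equals (1 - rho**2) times the identity
print([round(v, 6) for v in check[0]])
```

Applying the transformation to Y and X and then running ordinary least squares gives the generalized least squares estimates for this covariance pattern, which is why the final calculations can be handed to an ordinary regression routine.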


Output 


Full regression output as provided by the function REGR is provided, as well as an estimate of p if 
this parameter was calculated by the function. 
Example


Taken from Foot and North, page 143. See References. The data is displayed in the example of the
use of the function COCHRANEΔORCUTT.


      CON GLS PDI
COVARIANCE MATRIX KNOWN (YES OR NO)? NO
ENTER ρ (10 IF UNKNOWN) .65
ALIGN PAPER

CORRELATION MATRIX (WITH T-VALUES)

 1.00000   0.99504
67.81652   1.00000

MEAN OF DEPENDENT VARIABLE 24696.91345

VARIABLE            MEAN    ESTIMATED COEFFICIENT   STD. ERROR    T-VALUE
CONSTANT TERM                     6868.34904        1019.26017    6.73856
1            26965.92245             0.79597           0.01701   46.80592

SOURCE OF VARIATION   DF      SUM OF SQUARES        MEAN SQUARE   F-STATISTIC
MEAN                   1  29277001628.36993
REGRESSOR:X 1          1   3212060982.93429   3212060982.93429   5567.42127
RESIDUAL              46     26539181.80406       576938.73487
TOTAL                 48  32515601793.10827

MULTIPLE CORRELATION COEFFICIENT (R*2)...............    0.9918053540
CORRECTED R*2 (R̄*2)..................................    0.9916272095
F-STATISTIC FOR SIGNIFICANCE OF REGRESSION( 1, 46)...  5567.4212681407
STANDARD ERROR OF THE ESTIMATE.......................   759.5648325657
DURBIN-WATSON STATISTIC..............................     2.3500961554
COEFFICIENT OF VARIATION (AT THE MEAN OF Y)..........     3.0755455904


Notes and Hints


1) Any estimate of ρ which is provided by the user must be less than one in absolute value. If this
is not the case, an error message is printed and the input of ρ requested again.


2) The covariance matrix, if supplied by the user, must be symmetric, positive definite and of size
(N×N), where N is the number of observations. Violation of any of these criteria results in the
printing of an appropriate error message.


References 


Foot, D.K. and A. North. The Use and Misuse of Econometrics (With Reference to SHARP 
APL), Second Edition, University of Toronto and I.P. Sharp Associates Limited, 1977. 


Theil, H. Principles of Econometrics, John Wiley & Sons Inc., New York, 1971.


Source


A. North
I.P. Sharp Associates Limited
Ottawa, Ontario


HILDRETHΔLU
Syntax
Y HILDRETHΔLU X
Description
The function HILDRETHΔLU uses the technique of Hildreth-Lu to perform a regression analysis when
it is known or suspected that the residuals are autocorrelated. The final regression calculations are
performed by the function REGR, so that additional output options may be obtained by means of state
setting functions (see the documentation of REGR).


Arguments


Y - A vector of length N representing the dependent variable.
X - A matrix with N rows whose columns contain the independent variables.


Input

The user is asked to enter the Durbin-Watson statistic resulting from the previous regression which
indicated the existence of possible autocorrelation problems. If this is not known, and autocorrelation
problems are merely suspected, ¯1 should be entered.

The user must also specify the maximum number of iterations after which iteration is to cease
(regardless of whether convergence has been attained), and the size of increment which is to be used
in successive iterations.


Output


Full regression output as provided by the function REGR is printed, as well as initial and final estimates
of ρ (the coefficient of simple correlation between the variables), and an estimated standard error for
ρ.
Example


Taken from Foot and North, page 133. See References. The data is displayed in the example of the
use of the function COCHRANEΔORCUTT.


      CON HILDRETHΔLU PDI
ENTER DURBIN-WATSON FROM PREVIOUS REGRESSION
⎕:
      .7015
ENTER NUMBER OF ITERATIONS AND SIZE OF INCREMENT
      50 .01
** CONVERGENCE OBTAINED AFTER 47 ITERATIONS **
INITIAL ESTIMATE OF ρ IS 0.64925
FINAL ESTIMATE OF ρ IS 0.669305
STANDARD ERROR OF ρ IS 0.1083759038
ALIGN PAPER

CORRELATION MATRIX (WITH T-VALUES)

 1.00000   0.98613
39.86269   1.00000

MEAN OF DEPENDENT VARIABLE 18166.85524

VARIABLE            MEAN    ESTIMATED COEFFICIENT   STD. ERROR    T-VALUE
CONSTANT TERM                     2435.83358         403.59350    6.03536
1            19958.18610             0.78859           0.01978   39.86269

SOURCE OF VARIATION   DF      SUM OF SQUARES       MEAN SQUARE   F-STATISTIC
MEAN                   1  15511627571.80905
REGRESSOR:X 1          1    534354188.86103    534354188.86103   1589.03427
RESIDUAL              45     15132422.87500       336276.06389
TOTAL                 47  16061114183.54508

MULTIPLE CORRELATION COEFFICIENT (R*2)...............    0.9724607979
CORRECTED R*2 (R̄*2)..................................    0.9718488156
F-STATISTIC FOR SIGNIFICANCE OF REGRESSION( 1, 45)...  1589.0342675046
STANDARD ERROR OF THE ESTIMATE.......................   579.8931486823
DURBIN-WATSON STATISTIC..............................     2.3596134689
COEFFICIENT OF VARIATION (AT THE MEAN OF Y)..........     3.1920392447

** NOTE THAT CONSTANT TERM IS α(1-ρ) **
ORIGINAL CONSTANT TERM IS 7365.80106
STANDARD ERROR 1220.440573


Notes and Hints



1) The estimate of the intercept is equal to α(1-ρ); thus an estimate of α can be obtained by dividing
this intercept by (1-ρ), where ρ is the final estimate of ρ provided in the output. The original
intercept α and its standard error are printed as additional outputs.


2) Since an iterative procedure is used, it is necessary to specify a value by which ρ is to be
incremented (or decremented). Note that it is quite possible for there to be no minimum within
the range -1<ρ<1, in which case the final estimate of ρ will be given as 0.95 if 0.05 has been
defined as the increment, 0.995 if .005 has been defined, etc. This is a meaningless estimate of
ρ.
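
The search over ρ can be sketched in Python for a model with one independent variable. This is a simplified illustration and not the code of the workspace function: it scans a complete grid of ρ values instead of stepping from the initial estimate, the helper names are hypothetical, and the data are simulated. The starting value uses the common approximation ρ = 1 - d/2, where d is the Durbin-Watson statistic, which reproduces the initial estimate printed in the example.

```python
import random

def rho_from_durbin_watson(d):
    """Common approximation linking the Durbin-Watson statistic to rho."""
    return 1.0 - d / 2.0

def transformed_rss(xs, ys, rho):
    """Residual sum of squares after regressing y[t]-rho*y[t-1] on x[t]-rho*x[t-1]."""
    xt = [x1 - rho * x0 for x0, x1 in zip(xs[:-1], xs[1:])]
    yt = [y1 - rho * y0 for y0, y1 in zip(ys[:-1], ys[1:])]
    n = len(xt)
    mx, my = sum(xt) / n, sum(yt) / n
    sxx = sum((x - mx) ** 2 for x in xt)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xt, yt))
    slope = sxy / sxx
    return sum(((y - my) - slope * (x - mx)) ** 2 for x, y in zip(xt, yt))

def hildreth_lu(xs, ys, step=0.01):
    """Value of rho on a grid strictly inside (-1, 1) with the smallest RSS."""
    k = int(round(1.0 / step))
    grid = [i * step for i in range(-k + 1, k)]
    return min(grid, key=lambda r: transformed_rss(xs, ys, r))

print(rho_from_durbin_watson(0.7015))

# Simulated data (an assumption): y = 5 + 2x + u, with u[t] = 0.7 u[t-1] + noise.
rng = random.Random(1)
u, xs, ys = 0.0, [], []
for t in range(200):
    u = 0.7 * u + rng.gauss(0.0, 1.0)
    xs.append(float(t))
    ys.append(5.0 + 2.0 * t + u)
print(hildreth_lu(xs, ys, step=0.05))
```

With an increment of 0.05 the largest grid value is 0.95, so a result of 0.95 signals that no minimum was found inside the range, as described in note 2.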
References


Foot, D.K. and A. North. The Use and Misuse of Econometrics (With Reference to SHARP APL),
Second Edition, University of Toronto and I.P. Sharp Associates Limited, 1977.


Hildreth, C. and J.Y. Lu. "Demand Relations with Autocorrelated Disturbances", Technical Bulletin
276, Michigan State University Agricultural Experiment Station, November 1960, p. 14.


Source
A. North
I.P. Sharp Associates Limited
Ottawa, Ontario
INSTRUMENTAL
Syntax
Y INSTRUMENTAL X
Description


The function INSTRUMENTAL performs a regression analysis using instrumental variables substitution.
Optional output features are controlled by the state setting functions used by REGR in this workspace.


Arguments


Y - A vector of length N representing the dependent variable.

X - A matrix of N rows and 2K columns. The first K columns represent K independent variables,
the last K columns, the instruments applied to each of these variables. A variable may be its
own instrument.
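
The role of the instrument columns can be sketched in Python for the case K=1 with a constant term. This is an illustration only, not the code of INSTRUMENTAL; the function name and the data are hypothetical. The slope is the ratio of the instrument's covariation with Y to its covariation with the independent variable.

```python
def iv_simple(xs, zs, ys):
    """Instrumental variables estimate for one regressor plus a constant."""
    n = len(xs)
    mx, mz, my = sum(xs) / n, sum(zs) / n, sum(ys) / n
    szy = sum((z - mz) * (y - my) for z, y in zip(zs, ys))
    szx = sum((z - mz) * (x - mx) for z, x in zip(zs, xs))
    slope = szy / szx
    return my - slope * mx, slope

xs = [1.0, 2.0, 3.0, 4.0, 5.0]          # independent variable
zs = [2.0, 1.0, 4.0, 3.0, 6.0]          # instrument for xs
ys = [1.0 + 2.0 * x for x in xs]        # exact relationship, no disturbance
print(iv_simple(xs, zs, ys))
print(iv_simple(xs, xs, ys))            # a variable used as its own instrument
```

When a variable serves as its own instrument the formula reduces to the ordinary least squares estimate.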


Output 
The following fundamental statistics are printed:


1) The mean of the dependent variable.

2) Means, estimated coefficients and their standard errors and computed t-values for each
independent variable (including the constant term, where applicable).

3) Such equation statistics as the F-statistic for testing the significance of the overall regression (with
corresponding degrees of freedom), the standard error of the estimate, the coefficient of multiple
determination (R-squared), R-squared corrected for degrees of freedom, the Durbin-Watson
statistic and the coefficient of variation (measured in percentage terms at the mean of the
dependent variable).


In addition to this basic output, such optional output tables as an analysis of variance table, a
correlation matrix, a list of residuals, and the variance-covariance matrix of the estimated coefficients
are available through the use of state setting functions. These functions are described in the
documentation of the function REGR in this workspace.
Example


      'F8.1' ⎕FMT DATA
109.0 102.3 241.0
99.6 101.3 232.2
100.1 99.6 223.8
98.0 99.1 229.8
100.0 100.0 232.3
104.4 103.1 238.4
109.3 104.1 204.8
115.8 103.0 247.1
120.1 103.1
118.2 104.7
118.5 107.1 252.8
117.8 108.3 256.6
119.7 108.7 259.8
113.7 108.3 259.7
114.8 109.0 261.5
122.3 137.9 277.3
      STATE
CONSTANT
ANOVA
CORRELATION
NORESIDUALS
NOCOVARIANCE
      COVARIANCE
WAS NOCOVARIANCE
      RESIDUALS
WAS NORESIDUALS

      DATA[;1] INSTRUMENTAL DATA[;2 3 4 5]
ALIGN PAGE

CORRELATION MATRIX (WITH T-VALUES)

1.00000   0.83388   0.90780
5.65298   1.00000   0.76233
8.09865   4.40740   1.00000

MEAN OF DEPENDENT VARIABLE 132.98125

VARIABLE            MEAN    ESTIMATED COEFFICIENT   STD. ERROR    T-VALUE
CONSTANT TERM                     ¯545.72012         116.51976   ¯4.68350
1              111.33125             3.27385           1.23832    2.62348
2              104.98125             2.99351           2.05784    1.45458

SOURCE OF VARIATION   DF   SUM OF SQUARES   MEAN SQUARE   F-STATISTIC
MEAN                   1     282944.20562
REGRESSOR:X 1          1      19548.62418   19548.62418     133.73432
          X 2          1       1763.19421    1763.19421      12.06187
RESIDUAL              13       1900.27598     146.17508
TOTAL                 16     306156.25000

MULTIPLE CORRELATION COEFFICIENT (R*2)...............   0.8162613320
CORRECTED R*2 (R̄*2)..................................   0.7879938446
F-STATISTIC FOR SIGNIFICANCE OF REGRESSION( 2, 13)...  28.8763313368
STANDARD ERROR OF THE ESTIMATE.......................  18.1127879092
DURBIN-WATSON STATISTIC..............................   0.7264104069
COEFFICIENT OF VARIATION (AT THE MEAN OF Y)..........  13.6205577171

VARIANCE-COVARIANCE MATRIX OF ESTIMATED COEFFICIENTS

13576.8553    53.3528   ¯136.3265
   63.3628     1.5334     ¯2.2227
 ¯136.3265    ¯2.2297      4.2347

      OBSERVED Y   CALCULATED Y   RESIDUAL
 1       81.3000       117.3236   ¯36.0236
 2       78.5000        83.5596    ¯5.0595
 3       88.8000        80.1073     8.6927
 4       91.5000        71.7363    19.7637
 5      100.0000        80.9774    19.0226
 6      111.2000       105.6505     6.5395
 7      120.7000       123.6940    ¯2.9940
 8      133.3000       151.6786   ¯18.3788
 9      139.0000       156.0539   ¯17.0539
10      141.1000       154.6239   ¯13.5239
11      146.2000       162.7904   ¯16.5909
12      158.5000       169.0312    ¯5.5912
13      177.6000       171.5082     6.0918
14      170.1000       150.9693    19.1307
15      184.7000       156.3863    28.3337
16      205.2000       207.5595    ¯2.3595


References 


Foot, D.K. and A. North. The Use and Misuse of Econometrics (With Reference to SHARP APL),
Second Edition, University of Toronto and I.P. Sharp Associates Limited, 1977.


Sargan, J.D. "The Estimation of Economic Relationships Using Instrumental Variables",
Econometrica, Volume 26, 1958, pp. 393-415.


Source
A. North
I.P. Sharp Associates Limited
Ottawa, Ontario
PREDICT
Syntax
PREDICT
Description


The function PREDICT provides point estimates and corresponding error bands of values of the
dependent variable, given values for each of the independent variables.


Input


The user is asked to provide values for each of the independent variables, followed by the appropriate
t-value. The number of independent variables and the degrees of freedom of the requested t-statistic
are indicated by the function.


Output 


The confidence interval for the mean value of the dependent variable Y is printed, followed by the 
confidence interval for a specific value of the dependent variable. Point estimates, and their correspond- 
ing lower and upper confidence limits, are provided for as many predicted values of Y as the number 
of additional observations entered for the independent variables. 


Example 


The accompanying example was executed following the multiple regression analysis of armed forces
(AF) against population (POP) and gross national product (GNP) using the function REGR. (See the
example for REGR.) Additional observations of 231, 187 and 44 have been gathered for the independent
variable POP, and observations of 45, 65 and 9 for the independent variable GNP. A 90% confidence
level has been chosen; hence the value 1.297 has been entered from the tabulated values of the
t-statistic.


      PREDICT
ENTER VALUES FOR EACH OF THE 2 INDEPENDENT VARIABLES.
EACH COLUMN MUST CONTAIN AN INDEPENDENT VARIABLE IF A MATRIX OF OBSERVATIONS
IS ENTERED.
⎕:
      3 2⍴231 45 187 65 44 9
ENTER T-VALUE FOR 56 DEGREES OF FREEDOM.
⎕:
      1.297


CONFIDENCE INTERVAL FOR MEAN VALUE OF Y:

LOWER LIMIT   POINT ESTIMATE   UPPER LIMIT
  750.26935        821.92891     893.58848
  702.87123        761.53021     820.18919
  164.41397        204.59275     244.77153

CONFIDENCE INTERVAL FOR Y:

LOWER LIMIT   POINT ESTIMATE   UPPER LIMIT
  515.90043        821.92891    1127.95739
                   761.53021    1064.77
  ¯95.62823        204.59275     504.81373


Notes and Hints 


1) In order to correctly execute the function PREDICT, you must have available the tabulated values 
of the t-statistic for various confidence levels and various degrees of freedom. 


2) Note that the error band around the point estimate for an individual predicted value of Y is wider
than the error band around the point estimate for the mean value of Y; this will always be the
case.


Restrictions 

In order to calculate the predicted values for Y, the function PREDICT requires the estimated
coefficients and the degrees of freedom, among other things, resulting from a regression analysis. For this
reason, it is essential that the function be executed only immediately following an execution of the
function REGR.

Methodology

The function considers the global variables B (estimated coefficients) and SEE (the standard error of
the estimate) as produced by the function REGR and computes the required point estimates and
corresponding error bands. The appropriate formulae are given on pages 153-155 of Johnston. (See
References.)
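
The two error bands can be sketched in Python for a regression with one independent variable. This is an illustration using the standard textbook formulas, not the code of PREDICT; the function name and the data are hypothetical. The band for an individual value of Y adds the variance of a single disturbance to the variance of the estimated mean, which is why it is always the wider of the two.

```python
import math

def prediction_bands(xs, ys, x0, t):
    """Point estimate and half-widths of the mean and individual error bands."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    see = math.sqrt(sse / (n - 2))               # standard error of the estimate
    point = intercept + slope * x0
    h = 1.0 / n + (x0 - mx) ** 2 / sxx           # leverage of the new observation
    half_mean = t * see * math.sqrt(h)           # band for the mean value of Y
    half_indiv = t * see * math.sqrt(1.0 + h)    # band for an individual Y
    return point, half_mean, half_indiv

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 2.0, 5.0, 4.0]
point, half_mean, half_indiv = prediction_bands(xs, ys, 6.0, 2.0)
print(point - half_mean, point, point + half_mean)
print(point - half_indiv, point, point + half_indiv)
```

The t-value is supplied by the caller, as it is in PREDICT, so a table of the t-statistic is still needed.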


References 


Foot, D.K. and A. North. The Use and Misuse of Econometrics (With Reference to SHARP APL), 
Second Edition, University of Toronto and I.P. Sharp Associates Limited, 1977.


Johnston, J. Econometric Methods, Second Edition, McGraw-Hill, New York, 1972.


Source
A. North
I.P. Sharp Associates Limited
Ottawa, Ontario
REGR
Syntax
Y REGR X
Description
The function REGR performs a simple or multiple regression analysis. State setting functions control
the inclusion of an intercept in the equation specification, as well as the calculation and printing of
such optional features as an analysis of variance table, a correlation matrix, a list of residuals, and
a variance-covariance matrix of estimated coefficients.


Arguments
Y - A vector of N observations on the dependent variable.
X - An N×K matrix of observations on each of K independent variables. Each column represents
a variable such that the first variable to be entered into the regression equation is in the first
column, the second variable to be entered is in the second column, and so on. In the case of
a simple regression, when the model contains only one independent variable, the right
argument X may be either an N×1 matrix or a vector of length N.


State Setting Functions 


Optional output features for REGE may be selected by altering the state, which is defined by a group 
of state setting functions. These functions, with bracketed alternatives where applicable, are: 


STATE 
Displays the current state. 


DEFAULT 
Restores the state settings to the default values. 


ANOVA LNOANOVA] 
Generates an analysis of variance table listing sources of variation, degrees of freedom, sums of 
squares, mean squares and sequential F-statistics. The default setting is ANOVA. 


CONSTANT LNOCONSTANT] 
Dictates that an intercept (constant term) be included in the equation specification. A selection 
of the alternative VOCONSTANT forces the estimated regression line through the origin. The default 
setting is CONSTANT. 


CORRELATION LNOCORRELATION) 
Generates a correlation matrix containing simple correlation coefficients in its upper triangular 
portion. The lower triangle contains the t-statistics for testing for significant differences from zero 
of these coefficients. The first row of the matrix contains coefficients of correlation between the 
dependent variable and each of the independent variables, and the first column the corresponding 
t-statistics. The default setting is CORRELATION. 


RESIDUALS [NORESIDUALS]
Prints a list of all observed values of the dependent variable, along with the corresponding
estimated values and residuals. The default setting is NORESIDUALS.


COVARIANCE [NOCOVARIANCE]
Displays the variance-covariance matrix of the estimated regression coefficients. The default
setting is NOCOVARIANCE.
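
The t-statistics reported by the CORRELATION setting can be checked by hand. The sketch below is in Python rather than APL and assumes the usual formula t = r × sqrt((N-2)/(1-r²)) for testing a simple correlation against zero; the function name correlation_t is illustrative and is not part of the library. The formula reproduces the t-values printed in the example for this function.

```python
import math

def correlation_t(r, n):
    """t-statistic for testing a simple correlation r against zero,
    given n observations (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Correlation between the dependent variable and the second independent
# variable in the example (59 observations).
print(round(correlation_t(0.79447, 59), 3))
```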


Output 
In addition to the optional features described above, the following fundamental statistics are printed: 


1) The mean of the dependent variable.

2) Means, estimated coefficients and their standard errors and computed t-values for each indepen- 
dent variable (including the constant term, where applicable). 

3) Such equation statistics as the F-statistic for testing for the significance of the overall regression 
(with corresponding degrees of freedom), the standard error of the estimate, the coefficient of 
multiple determination (R-squared), R-squared corrected for degrees of freedom, the Durbin- 
Watson statistic and the coefficient of variation (measured in percentage terms at the mean of 
the dependent variable). 
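
The statistics listed above can be illustrated with a small least squares calculation. The following is a minimal Python sketch, not the library's APL code: the names solve and ols are hypothetical, the formulas are the textbook ones (and the R-squared lines assume a constant term is included), and the five-point data set is invented for illustration only.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            # Exact linear dependence among the regressors (singular matrix).
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(y, x_rows, constant=True):
    """Least squares fit of y on the columns of x_rows (a list of rows)."""
    rows = [([1.0] if constant else []) + list(r) for r in x_rows]
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    b = solve(xtx, xty)
    resid = [yi - sum(bi * ri for bi, ri in zip(b, r)) for r, yi in zip(rows, y)]
    sse = sum(e * e for e in resid)
    s2 = sse / (n - k)                       # residual mean square
    # Standard errors from the diagonal of s2 * inverse(X'X).
    se = [math.sqrt(s2 * solve(xtx, [1.0 if j == i else 0.0 for j in range(k)])[i])
          for i in range(k)]
    ybar = sum(y) / n
    sst = sum((yi - ybar) ** 2 for yi in y)  # variation about the mean
    r2 = 1.0 - sse / sst
    return {
        "coef": b,
        "std_err": se,
        "t": [bi / si for bi, si in zip(b, se)],
        "r2": r2,
        "r2_corrected": 1.0 - (1.0 - r2) * (n - 1) / (n - k),
        "see": math.sqrt(s2),
        "durbin_watson": sum((resid[i] - resid[i - 1]) ** 2
                             for i in range(1, n)) / sse,
        "cv_percent": 100.0 * math.sqrt(s2) / ybar,
    }

fit = ols([1.0, 3.0, 2.0, 5.0, 4.0], [[1.0], [2.0], [3.0], [4.0], [5.0]])
print(fit["coef"], fit["r2"])
```

The first coefficient is the constant term, matching the order of the printed output.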


Example 
Taken from Foot and North, page 47. See References. 


'I8,2F8.1' ⎕FMT(AP;POP;GNP)
2780 750.0 80. 


930 550.0 42.0 
2955 244.0 466.0 
3161 205.0 932.0 
325 128.0 16.0 
385 118.0 10.0 
259 104.0 167.0 
466 59.0 150.0 
380 58.0 109.0
413 54.0 82.0 
506 51.0 140.0 
33 39.0 1.0 
155 36.0 6.0 
478 35.0 14.0 
282 33.0 27.0 
242 33.0 81.0 
288 33.0 6.0 
645 32.0 8.0 
161 28.0 9.0 
143 28.0 2.0 
433 22.0 2.0 
238 21.0 11.0 
93 21.0 67.0 
181 20.0 20.0 
nn 18.0 3.0 
129 17.0 32.0 
50 16.0 3.0 
523 14.0 5.9 
413 14.0 3.0 
168 14.0 28.0
57 18.0 3.0 
121 13.0 28.0 
85 13.0 32.0 
48 11.0 4.0 
186 10.0 5.0 
102 10.0 14.0 
95 10.0 22.0 
159 9.0 8.0 
149 9.0 8.0 
95 9.0 3.0
[The remaining 19 rows of the 59-observation listing are illegible in this copy.]


STATE 
CONSTANT 
ANOVA 
CORRELATION 
NORESIDUALS 
NOCOVARIANCE

COVARIANCE 
WAS NOCOVARIANCE 


AP REGR ⍉2 59⍴POP,GNP
ALIGN PAPER 


CORRELATION MATRIX (WITH T-VALUES)

     1.00000     0.72176     0.79447
     7.87300     1.00000     0.31576
     9.87666     2.51252     1.00000

MEAN OF DEPENDENT VARIABLE     331.91525

VARIABLE          MEAN       ESTIMATED COEFFICIENT    STD. ERROR     T-VALUE
CONSTANT TERM                       57.75926            32.96632     1.75207
     1          50.41186             2.72747             0.25705    10.61062
     2          45.85085             2.98051             0.23347    12.76617

SOURCE OF VARIATION   DF     SUM OF SQUARES        MEAN SQUARE    F-STATISTIC
MEAN                   1      6499896.42373
REGRESSOR: X 1         1     12530078.74296     12530078.74296      238.12263
           X 2         1      8575795.35191      8575795.35191      162.97509
RESIDUAL              56      2946735.48140        52620.27645
TOTAL                 59     30552505.00000

MULTIPLE CORRELATION COEFFICIENT (R*2) ..........     0.877487904
CORRECTED R*2 (R̄*2) .............................     0.873112472
F-STATISTIC FOR SIGNIFICANCE OF REGRESSION ......   200.5488617446
STANDARD ERROR OF THE ESTIMATE ..................   229.3910993337
DURBIN-WATSON STATISTIC .........................     2.4354323959
COEFFICIENT OF VARIATION (AT THE MEAN OF D.V.) ..    69.1113458647

VARIANCE-COVARIANCE MATRIX OF ESTIMATED COEFFICIENTS

  1086.7785     ¯2.4621     ¯1.5439
    ¯2.4621      0.0661     ¯0.0190
    ¯1.5439     ¯0.0190      0.0545
Notes and Hints 


1) The input matrix X of independent variables need not include a column of ones to represent the
   constant term. An intercept is automatically inserted into the specification equation by the function
   REGR, whenever the (default) state setting CONSTANT is in effect.

2) The F-statistics in the analysis of variance table are sequential F-statistics, i.e., F3 provides a
   test of the significance to the regression of the independent variable X3 given that the variables
   X1 and X2 have already been included in the regression equation. Thus the sequence of the
   columns in the input matrix X of independent variables affects the results printed in the analysis
   of variance table (although it has no bearing upon the estimated coefficients or resulting equation
   statistics).

3) The functions generate seven global variables (DDD, SEE, YHAT, B, DF, ΔE, ASE). If variables
   of any of these names exist in your workspace, the original values of these variables will
   be overwritten. The variables ΔE and YHAT are N×1 matrices containing the residuals and the
   calculated Y-values respectively. These may be useful for subsequent plotting purposes. The
   matrix ASE is also generated and available in the active workspace following the execution of
   REGR. This is the variance-covariance matrix of estimated coefficients and may be of use in
   performing more complicated hypothesis tests concerning possible relationships between the
   estimated coefficients.

4) Whenever an equation is estimated involving (log Y) as the dependent variable, all equation
   statistics (R-squared, the standard error of the estimate, etc.) are measured in units of (log Y).
   The function CONVERT, if executed immediately following an execution of REGR, reproduces all
   equation statistics in terms of the original units of the dependent variable. (See the documentation
   of CONVERT.)

5) The function PREDICT, if executed immediately following an execution of REGR, provides point
   estimate forecasts and their corresponding error bands based on the equation estimated by
   REGR. (See the documentation of PREDICT.)

6) The error message DOMAIN ERROR may be printed as a result of the function's attempt to invert
   a singular matrix. This occurs whenever any of the independent variables is an exact linear
   combination of any or all of the other independent variables, and is a problem which must be
   solved by the analyst.

7) The functions GLS, INSTRUMENTAL, STAGE3, COCHRANEΔORCUTT and HILDRETHΔLU available
   in this workspace may be useful for subsequent analysis. GLS performs a generalized least
   squares analysis, INSTRUMENTAL an instrumental variable substitution, and STAGE3 a two- and
   three-stage least squares analysis. COCHRANEΔORCUTT and HILDRETHΔLU provide alternative
   techniques for autocorrelation adjustment.

References
Johnston, J. Econometric Methods, Second Edition, McGraw-Hill, New York, 1972.


Foot, D.K. and A. North. The Use and Misuse of Econometrics (With Reference to SHARP APL),
Second Edition, University of Toronto and I.P. Sharp Associates Limited, 1977.


Source
A. North
I.P. Sharp Associates Limited
Ottawa, Ontario

STAGE3

Syntax

Y STAGE3 X

Description

The function STAGE3 performs a two- or three-stage least squares analysis of a system of equations.

The state setting functions CONSTANT/NOCONSTANT and NOCOVARIANCE/COVARIANCE in this
workspace control optional input and output, respectively.

Arguments 

Y - A matrix of N rows whose columns contain all dependent variables and all jointly dependent
variables (i.e., variables which also appear on the right-hand side of one or more of the
structural equations in the system). The sequence of the columns (variables) is immaterial,
since explicit column numbers for all variables must be specified as input (see below).

X - A matrix of N rows whose columns contain all independent variables in the system, excluding
jointly dependent variables. A column of ones need not be included to represent an intercept
term; inclusion of a constant term is controlled by the state setting function
CONSTANT/NOCONSTANT.

Input 


The user is first asked whether a two-stage or three-stage analysis (or both) is desired. Then the 
following user inputs are required: 


1) The number of structural equations in the system. This should include all equations which are 
to be estimated simultaneously, exclusive of any definition equations. These latter equations can 
be estimated separately using the function REGR in this workspace. 


2) The column numbers of the dependent variable in each of the equations. These are the column 
indices in Y of the variables which appear on the left-hand side of each equation. 


3) The column number(s) of the jointly dependent variable(s) in each equation. These are the
column indices in Y of the dependent variables which appear on the right-hand side of each
equation.


4) The column number(s) of the independent variable(s) in each equation. These are the column
indices in X of the independent variables appearing on the right-hand side of each equation.


Output 
Standard output includes: 
1) The estimated coefficients, standard errors and t-values for all equations for both the three-stage
analysis and the intermediate (two-stage) analysis, as requested. Included is an indication as to
whether each variable is a Y or X variable on the right-hand side.

2) Equation statistics for each equation following each stage of the analysis. These include corrected
and uncorrected R*2, the overall F-statistic with its degrees of freedom, the standard error of the
estimate, the coefficient of variation (measured in percentage terms at the mean of Y), and the
Durbin-Watson statistic.


As an optional output, the variance-covariance matrix of both the two-stage and three-stage estimators 
is available whenever the state setting COVARIANCE is in effect. 
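
For a single structural equation, the two-stage part of the analysis can be sketched as follows. This is a Python illustration of the textbook two-stage least squares procedure, not the STAGE3 code itself: the helper names (solve, ols_coef, two_stage) and the data are hypothetical, and the third stage (which combines the equations using the covariance of the structural disturbances) is not shown.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols_coef(rows, y):
    """Least squares coefficients of y on the columns of rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def two_stage(y, endog_cols, exog_cols, all_exog_rows):
    """Two-stage least squares for one equation.

    endog_cols    - columns of jointly dependent variables on the right-hand side
    exog_cols     - columns of independent variables included in this equation
    all_exog_rows - rows holding every independent variable in the system
    """
    # Stage 1: replace each jointly dependent variable by its fitted value
    # from a regression on all independent variables in the system.
    fitted = []
    for col in endog_cols:
        b = ols_coef(all_exog_rows, col)
        fitted.append([sum(bi * ri for bi, ri in zip(b, r)) for r in all_exog_rows])
    # Stage 2: ordinary least squares using the fitted values.
    rows = [list(values) for values in zip(*(fitted + exog_cols))]
    return ols_coef(rows, y)

# Noise-free illustrative data, so the generating coefficients are recovered.
z1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
z2 = [2.0, 1.0, 4.0, 3.0, 6.0, 4.0]
ones = [1.0] * 6
p = [1.0 + 2.0 * a + b for a, b in zip(z1, z2)]        # jointly dependent
y = [3.0 + 0.5 * pi + 2.0 * a for pi, a in zip(p, z1)]  # structural equation
coef = two_stage(y, [p], [ones, z1], [list(r) for r in zip(ones, z1, z2)])
print([round(c, 6) for c in coef])
```

The coefficients are returned in the order jointly dependent variables first, then the included independent variables.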


Example 


Taken from Foot and North, page 218. See References. 


"Polit ОРМТ Y 
41.9 0.2 25.5 
45.0 1.9 — 29.2 
33.2 s2 за 
50:6 30 axs 
52.6 у оза 
561 5.6 ала 
56:2 б? 379 
57.3 so 38.2 
$7.8 Ул зз 
55:0 10 3719 
80.40 Daa 34.5 
asie — 75:2 29.0 
чыз 7$ 2805 
ат 23:0 30.6 
эз 7613 312 
57.7 211 зв:8 
58.7 29004. 
57:5 7i 28:2 
61.6 1 ав 
65.0 зз амо 
89.7 #9 зз 


"ра 1“ ОРИТ Y 
2.7 22 5.6 1.9 
21% sls 6л 1.0 
DE “7 s 110 
за за 5:6 1.0 
3.2 D 6.5 ile 
DEI 7;0 6.6 1.0 
Dn єт 7.6 1:0 
эт “2 7.8 1.5 
n ne ва 1.0 
D D ale 1:0 
DH TS 10:7 1.0 
sia вз 10:2 19 
5.5 8% 823 1:0 
6:0 818 10:0 1:0 
Dn 7.2 108 1.0 
7. B3 103 1.0 
617 6:7 100 10 
T3 TX i30 1:0 
а ИТУ! 1:0 
8:2 DEC 1:0 
as ome 22.3 110 
COVARIANCE
WAS NOCOVARIANCE

NOCONSTANT
WAS CONSTANT


Y STAGE3 X


TWO-STAGE, THREE-STAGE OR BOTH (TYPE 2, 3 OR BOTH)? BOTA 


NUMBER OF STRUCTURAL (WON-DEFINITION) BQ 


COLUMN AUMBERS OF DEPENDENT VARIABLES FOR TRE 3 EQUATIONS 1 2 3 


ENTER COLUNM KUMBERS OF THE POLLOWIN( 


JOINTLY DEPENDERT VARIABLE(S) ON R.B.S. 
INDEPERDENT VARIABLE(S) ON R.A.S. OF BQ. 


JOINTLY DEPENDENT VARIABLE(S) ON R.H.S 
INDEPENDENT VARIABLE(S) ON 8.8.5. OF FQ. 


JOINTLY DEPENDENT VARIABLE(S) OR R.H.S. 
INDSPERDENT VARIABLE(S) ON 8.8.8. ОР EQ. 


VATIONS 3 


OF кд. 1:45 
1: 58 

OP EQ, 2: а 
2. 56а 

OF EQ. 3 


з ка 


EQUATION VARIABLE COBPPICIEMT S 
1 Tow 9.0173 
Tos 
тоз 
тов 18.5548 
2 тов 0.1502 
19% 016159 
тов 70.1878 
тов 20.2782 
3 196 0.5389 
107 0.1a67 
тоа 9: 1304 
19% 1.5003 


TD. PRROR 


0.0139 “0.0015 70.0098 70.0153 
70.000 — 70.0327 

.0.0115 — 70.0048 

10048 1.7995 


STATISTICS ZOR EQUATION 2 

MULTIPLE CORRELATION COEFPICIENT (F 82) 
CORRECTED #2 (Eri) 

Greast P-staristic (9,17) 

SIDERFOR OF TRE ESTIMATE 
COEFFICIENT OP VARIATION 

BUPBIN CMATSOY. STATISTIC 


MULTIPLE CORRELATION COEPRICIENT (R92) 
CORRECTED Re? (ба?) 

OVERALL F-STATISTIC (3,1) 

STD. ERROR OF THE ESTIMATE 
COEFPICIENT OF VARIATION 
DUFBIN-WATSON STATISTIC 


0.976711 
0.972601 
237.649508 
11135659 
2.103257 
1.495072 


ШП 
0.864569 
43.559005 
1.307149 
103.195980 
21085336 


T-VALUE 
0.1656 
20.1289 
2.0158 
12.5340 


0.8622 
3.7838 
EM 
2.6885 


70.0258 
9.9255 
700039 
0,1176 
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0.0052 
70.0038 

0.0013 
70.2690 


.9248 
17175 
11690 
К 


.0013 


0012 


10003 
10067 


70.0012 
9.0015 


7010154 


70.0003 
9,0001 
0.0008 
0.0156 


10.0067 
70.0154 
0:0156 
1:378 


MULTIPLE CORRELATION СОБРРТСТЕНТ (R*2) 0.987418 


CORRECTED Ве? (фа?) 9.985193 
OVERALL F-STATISTIC (3,17) 94: 5588723 
STD. ERROR ОР THE ESTINATE 0.767155 
COEFFICIENT OF VARIATION 2.109778 
DURBIN-WATSON STATISTIC 1.963616 


3515 PARAMETER ESTIMATION 


EQUATION VARIABLE COEFFICIENT STD. ERROR 
1 You 9.0479 9.1125 
10% 9.8164 0.0388 
ros 0.1097 0.1086 
108 18.1926 113003 
2 точ 0.2110 0.1689 
Xos 0.5669 0.1508 
тов 0.1472 0.0387 
E 15.8242 7.2400 
E 106 0.9282 0.0347 
107 0.1524 9.0377 
хоз 0.1357 0.0287 
тов 1.6936 INC 


515 СОЕРРТСТЕНТЗ 
70.0014 70.0090 70.0184 0.0039 
0.0015 "0.0005 79.020% 
10.0005 0.010 "0.0063 
70.016, "0.030 1020063 1.590, 
0:0039 0.0013 7010048 7010458 
79.0030 70.0012 0.005) 0.0161 
0.0004 — 0.0004 70.000) — 70.0080 
0:015 70.0819 0.0150 2.133% 
70.0012 0.0000 0.0008 0.0063 
0.0013 70.0001 70.0012 0.0007 2 
0.0004 — 70.0003 9.0003 9.0012 70.0004 
0.0001 — 0.0011 0.0228 70.4322 0.0022 


MULTIPLE CORRELATION COEFFICIENT (R#2) — 0.978170 


CORRECTED я? (62) 0.978317 
OVERALL P-STATISTIC (3,17) 253.910119 
STD. ERROR ОР TRE ESTIMATE 1.099513 
COEPPICIERT OP VARIATION 2.026315 
DURBIN-WATSON STATISTIC 12992690 


STATISTICS FOR EQUATION 2 


MULTIPLE CORRELATION COEFFICIENT (F«2) —— 0.900398 


CORRECTED Не? (842) 0.882821 
OVERALL F-STATISTIC (3,17) 51.226328 
STD. ERROP OF THE ESTIMATE 11215482 
COBPPICIENT OF VARIATION 95:990711 
DURBIW-WATSON STATISTIC 2.09328 


MULTIPLE CORRELATION COBEPICIENT (Re?) 0.987219 
CORRECTED На? (8*2) 9.985076 
OVERALL F-STATISTIC (2,17) war. 028277 
STD. EPROP OF THE ESTIMATE 0.770180 
COEFFICIENT OF VARIATION 2.118098 
DURBIN-WATSON STATISTIC 2.038255 


f-VALUE 
Den 

21.159 
1.8133 

12,4536 


1.2897 
3.5691 
7912932 
2.8730 


2.3915 
410919 
427329 
124837 
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70.0004 
2 


1000% 
:0001 


“0.0080 


0 


10038 
10036 
10012 
12479 
10000 

0001 
10002 
20004 


0.0615 
70.0319 
0.0150 
-2:1339 
018532 
29.7115 
0.2479 
$2.5332 
0.0048 
0.0113 
0.0227 
7013850 


70.0012 
0:0000 
0:0008 

2820063 
0:0007 
519004 
0.0000 

70.0046 

0.0012 

10.0011 

20.9003 
0.0070 


0.0013 
10.0001 
70.0012 
0.0007 
0:0007 
16.0007 
“0.0001 
0.0113 
70.0011 
0.0014 
_9.0000 
0.0199 


0.000% 
70.0005 
0.0003 
0.0012 
0:000% 
0.0006 
70.0002 
10.0277 
0.0003 
9.0000 
9.0008 
0.0153 


0.0001 
0.0011 


Notes and Hints 


1) A response of 3 to the initial input prompt, indicating that a 3-stage least squares analysis is 
to be performed, suppresses the printing only of the second stage results; the calculation of these 
intermediate results is still required. A response of 2 suppresses both calculation and printing 
of the third stage results. 


2) The moment matrix of the structural disturbances of the second stage analysis is available to the
user as the global variable MM following execution of the function STAGE3.


References 


Zellner, A. and H. Theil. "Three-Stage Least Squares: Simultaneous Estimation of Simultaneous
Equations", Econometrica, Volume 30, No. 1, January 1962.


Theil, H. Principles of Econometrics, John Wiley & Sons Inc., New York, 1971, Chapter 10.


Foot, D.K. and A. North. The Use and Misuse of Econometrics (With Reference to SHARP APL),
Second Edition, University of Toronto and I.P. Sharp Associates Limited, 1977.


Source
A. North
I.P. Sharp Associates Limited
Ottawa, Ontario


32 STEPWISE


FUNCTIONS

Function Header               Documentation    Description
ALPHA STEPWISEC DATA          STEPWISECHOW     Performs conversational stepwise
                                               regression.
ALPHA STEPWISEFC FILEINFO     STEPWISEFCHOW    Performs conversational stepwise
                                               regression on data stored in a file.


STEPWISEC
Syntax
ALPHA STEPWISEC DATA
Description
STEPWISEC is a conversational program which performs stepwise linear regression. Entry of variables
into the regression or removal of them from the regression may be either under computer or user
control.


Arguments 


ALPHA - This value determines whether the entry and removal of variables is to be computer- 
or user-controlled. In the first case ALPHA should have a value strictly between 0 
and 1, where (1-ALPHA) is the minimum probability level that must be attained by 
the F-ratio of an independent variable in order to qualify that variable for entry and 
to suppress its removal. In the second case ALPHA should be 0, in which case the 
user’s decision concerning entry and removal of variables will be requested by 
STEPWISEC. 

DATA - The matrix of observations, with the dependent variable being the last column and
the independent variables the remaining columns. 


Output 
Each time an independent variable is selected for entry into the regression, the column number of
the variable, the reduction in the proportion of variation, the reduction in the sum of squares, and
the sequential F-test value and probability are printed.


Each time an independent variable is selected for removal from the regression, the column number 
of the variable and the partial F-test value and probability are printed. 


If a variable is either entered or removed from the regression, the following will be printed: 

1) An analysis of variance table giving the source, SS, MS, and F. 

2) R-squared, R-bar-squared, the standard error of the estimate, and the intercept of the equation. 
3) A table giving, for each independent variable in the regression, the estimated coefficient, the
standard error, the computed t, and the partial F-value.


At the end of the stepwise analysis a table of residuals will also be printed.
Example


Taken from pages 178-195 of Draper and Smith. See References. 


DATA
  7.0   26.0    6.0   60.0    78.5
  1.0   29.0   15.0   52.0    74.3
 11.0   56.0    8.0   20.0   104.3
 11.0   31.0    8.0   47.0    87.6
  7.0   52.0    6.0   33.0    95.9
 11.0   55.0    9.0   22.0   109.2
  3.0   71.0   17.0    6.0   102.7
  1.0   31.0   22.0   44.0    72.5
  2.0   54.0   18.0   22.0    93.1
 21.0   47.0    4.0   26.0   115.9
  1.0   40.0   23.0   34.0    83.8
 11.0   66.0    9.0   12.0   113.3
 10.0   68.0    8.0   12.0   109.4
.05 STEPWISEC DATA 

VARIABLE SELECTED FOR ENTRY ............ COLUMN 4
PROPORTION OF VARIATION REDUCED ........ 0.674542
SUM OF SQUARES REDUCED ................. 1831.896160
SEQUENTIAL F-TEST VALUE (1,11) ......... 22.798520
PROBABILITY ............................ 0.999574

PROBABILITY OF SEQUENTIAL F GREATER THAN 1 - α = 0.950000
COLUMN 4 ACCEPTED FOR ENTRY INTO REGRESSION.

ANALYSIS OF VARIANCE

SOURCE         DF            SS            MS           F
MEAN            1   118372.3269
REGRESSION      1     1831.8962     1831.8962     22.7985
  (COL. 4)
RESIDUAL       11      883.8669       80.3515
TOTAL          13   121088.0900

R SQUARED ...................... 0.674542
R-BAR SQUARED .................. 0.644955
STANDARD ERROR OF THE ESTIMATE . 8.963902
INTERCEPT OF THE EQUATION ...... 117.567931

VARIABLE   REGR. COEFF.   STD. ERROR.   COMPUTED T     PARTIAL F
COL. 4        ¯0.738162      2.605163    ¯0.283346     22.798520


VARIABLE SELECTED FOR ENTRY ............ COLUMN 1
PROPORTION OF VARIATION REDUCED ........ 0.297929
SUM OF SQUARES REDUCED ................. 809.104805
SEQUENTIAL F-TEST VALUE (1,10) ......... 108.223909
PROBABILITY ............................ 0.999999

PROBABILITY OF SEQUENTIAL F GREATER THAN 1 - α = 0.950000
COLUMN 1 ACCEPTED FOR ENTRY INTO REGRESSION.

ANALYSIS OF VARIANCE

SOURCE         DF            SS            MS           F
MEAN            1   118372.3269
REGRESSION      2     2641.0010     1320.5005    176.6270
  (COL. 4, 1)
RESIDUAL       10       74.7621        7.4762
TOTAL          13   121088.0900

R SQUARED ...................... 0.972471
R-BAR SQUARED .................. 0.966965
STANDARD ERROR OF THE ESTIMATE . 2.734266
INTERCEPT OF THE EQUATION ...... 103.097382

VARIABLE   REGR. COEFF.   STD. ERROR.   COMPUTED T     PARTIAL F
COL. 4        ¯0.613954      0.794655    ¯0.772604    159.295210
COL. 1         1.439958      1.033887     1.392762    108.223909

VARIABLE SELECTED FOR ENTRY ............ COLUMN 2
PROPORTION OF VARIATION REDUCED ........ 0.009864
SUM OF SQUARES REDUCED ................. 26.789383
SEQUENTIAL F-TEST VALUE (1,9) .......... 5.025865
PROBABILITY ............................ 0.948438

PROBABILITY OF SEQUENTIAL F LESS THAN 1 - α = 0.950000
COLUMN 2 REJECTED FOR ENTRY. STEPWISE ANALYSIS TERMINATES.


TABLE OF RESIDUALS

OBSERVED Y     ESTIMATED Y      RESIDUAL
 78.500000       76.339872      2.160128
 74.300000       72.611751      1.688249
104.300000      106.657850     ¯2.357850
 87.600000       90.081102     ¯2.481102
 95.900000       92.916620      2.983380
109.200000      105.429943      3.770057
102.700000      103.733535     ¯1.033535
 72.500000       77.523380     ¯5.023380
 93.100000       92.470318      0.629682
115.900000      117.373711     ¯1.473711
 83.800000       83.662917      0.137083
113.300000      111.569479      1.730521
109.400000      110.129521     ¯0.729521


Notes and Hints 


1) There is no need to include a column of 1's as an independent variable. An intercept is
automatically included in the regression.


2) If your data is stored in a file the function STEPWISEFC in this workspace can be used to obtain 
the same analysis performed by STEPWISEC. 


Methodology 


At each step of the regression the variable with the largest partial F-value is selected as a candidate
for entry. Under computer control, the entry of the candidate is made according to whether the
probability of its partial F exceeds (1-ALPHA); under user control, the entry of the candidate is subject
to the approval of the user.


Subsequent to the entry of a variable, a test is made to see whether there are any variables whose
partial F is lower in value than that of the variable just entered. If so, the variable with the lowest
partial F is selected as a candidate for removal. The removal of that variable is again conditional upon
the decision of one of the two methods of control.


The stepwise analysis terminates when there are no more variables for entry or removal, or whenever
the entry of a selected variable is rejected.
References


Draper, N.R. and H. Smith. Applied Regression Analysis, John Wiley & Sons Inc., New York, 
1966. 


Efroymson, M.A. "Multiple Regression Analysis", Mathematical Methods for Digital Computers,
John Wiley & Sons Inc., New York, 1960.


Source
R. Hui
I.P. Sharp Associates Limited
Calgary, Alberta 




STEPWISEFC 
Syntax 
ALPHA STEPWISEFC (FN, CN1, CN2, COLUMNS)
Description
STEPWISEFC is a conversational program which performs stepwise linear regression for data stored
in a file. Entry of variables into the regression and removal of them from the regression may be either
under computer or user control.


Arguments 

ALPHA - The value is dependent on whether the entry and removal of variables are to be
computer- or user-controlled. In the first case ALPHA should have a value strictly
between 0 and 1, where (1-ALPHA) is the minimum probability level that must be
attained by the F-ratio of an independent variable in order to qualify that variable
for entry and to suppress its removal. In the second case ALPHA should be 0, and
the user's decision concerning entry and removal of variables will be requested by
STEPWISEFC.

FN - The file tie number of the data file.

CN1 - The lower limit of the file components to be read.

CN2 - One plus the upper limit of the file components to be read.

COLUMNS - The column numbers of the independent variables followed by the column number
of the dependent variable.
Format of the Data File 
The data should be stored in contiguous components of the file, with a numeric matrix residing in 
each component. The matrices need not necessarily be of the same size. Each time a component of 
a file is read, only the columns specified by COLUMNS are taken; the other columns are ignored. 
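
The way the file is traversed can be sketched as follows. This Python fragment is only an illustration: the file is modelled as an in-memory dictionary keyed by component number (a hypothetical stand-in for a tied SHARP APL file), read_components is an invented name, and column numbers are assumed to be 1-origin.

```python
def read_components(components, cn1, cn2, columns):
    """Collect the selected columns from components cn1 .. cn2-1.

    components - dict mapping component number to a numeric matrix (list of rows)
    columns    - 1-origin column numbers; all other columns are ignored
    """
    rows = []
    for number in range(cn1, cn2):          # cn2 is one plus the upper limit
        for row in components[number]:      # matrices may differ in row count
            rows.append([row[c - 1] for c in columns])
    return rows

data_file = {1: [[1, 2, 3], [4, 5, 6]], 2: [[7, 8, 9]]}
print(read_components(data_file, 1, 3, [3, 1]))
```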
Output 
Each time an independent variable is selected for entry into the regression, the column number of
the variable, the reduction in the proportion of variation, the reduction in the sum of squares, and
the sequential F-test value and probability are printed.


Each time an independent variable is selected for removal from the regression, the column number
of the variable and the partial F-test value and probability are printed.


If a variable is either entered or removed from the regression, the following will be printed: 

1) An analysis of variance table giving the source, SS, MS, and F.

2) R-squared, R-bar-squared, the standard error of the estimate, and the intercept of the equation.
3) A table giving, for each independent variable in the regression, the estimated coefficient, the
standard error, the computed t, and the partial F-value.


At the end of the stepwise analysis a table of residuals may optionally be printed.
Example
Taken from pages 178-195 of Draper and Smith. See References. 


0 STEPWISEFC 100 1 14 1 2 3 4 5


VARIABLE SELECTED FOR ENTRY ............ COLUMN 4
PROPORTION OF VARIATION REDUCED ........ 0.674542
SUM OF SQUARES REDUCED ................. 1831.896160
SEQUENTIAL F-TEST VALUE (1,11) ......... 22.798520
PROBABILITY ............................ 0.999574

ENTER COLUMN 4? YES
COLUMN 4 ACCEPTED FOR ENTRY INTO REGRESSION.

ANALYSIS OF VARIANCE

SOURCE         DF            SS            MS           F
MEAN            1   118372.3269
REGRESSION      1     1831.8962     1831.8962     22.7985
  (COL. 4)
RESIDUAL       11      883.8669       80.3515
TOTAL          13   121088.0900

R SQUARED ...................... 0.674542
R-BAR SQUARED .................. 0.644955
STANDARD ERROR OF THE ESTIMATE . 8.963902
INTERCEPT OF THE EQUATION ...... 117.567931

VARIABLE   REGR. COEFF.   STD. ERROR.   COMPUTED T     PARTIAL F
COL. 4        ¯0.738162      2.605163    ¯0.283346     22.798520

VARIABLE SELECTED FOR ENTRY ............ COLUMN 1
PROPORTION OF VARIATION REDUCED ........ 0.297929
SUM OF SQUARES REDUCED ................. 809.104805
SEQUENTIAL F-TEST VALUE (1,10) ......... 108.223909
PROBABILITY ............................ 0.999999

ENTER COLUMN 1? YES
COLUMN 1 ACCEPTED FOR ENTRY INTO REGRESSION.

ANALYSIS OF VARIANCE

SOURCE         DF            SS            MS           F
MEAN            1   118372.3269
REGRESSION      2     2641.0010     1320.5005    176.6270
  (COL. 4, 1)
RESIDUAL       10       74.7621        7.4762
TOTAL          13   121088.0900

R SQUARED ...................... 0.972471
R-BAR SQUARED .................. 0.966965
STANDARD ERROR OF THE ESTIMATE . 2.734266
INTERCEPT OF THE EQUATION ...... 103.097382

VARIABLE   REGR. COEFF.   STD. ERROR.   COMPUTED T     PARTIAL F
COL. 4        ¯0.613954      0.794655    ¯0.772604    159.295210
COL. 1         1.439958      1.033887     1.392762    108.223909

VARIABLE SELECTED FOR ENTRY ............ COLUMN 2
PROPORTION OF VARIATION REDUCED ........ 0.009864
SUM OF SQUARES REDUCED ................. 26.789383
SEQUENTIAL F-TEST VALUE (1,9) .......... 5.025865
PROBABILITY ............................ 0.948438

ENTER COLUMN 2? NO
COLUMN 2 REJECTED FOR ENTRY. STEPWISE ANALYSIS TERMINATES.

RESIDUALS? NO


Notes and Hints 


1) If either CN1 or CN2 is invalid, the defaults (⎕SIZE FN)[1] for CN1 and (⎕SIZE FN)[2] for
CN2 are taken. That is, every component of the file will be read.


2) There is no need to include a column of 1's as an independent variable. An intercept is
automatically included in the regression.


3) A request for a table of residuals necessitates that a second pass be made over the data file. 


Methodology 


At each step of the regression the variable with the largest partial F-value is selected as a candidate
for entry. Under computer control, the entry of the candidate is made according to whether the
probability of its partial F exceeds (1-ALPHA); under user control, the entry of the candidate is subject
to the approval of the user.


Subsequent to the entry of a variable, a test is made to see whether there are any variables whose
partial F is lower in value than that of the variable just entered. If so, the variable with the lowest
partial F is selected as a candidate for removal. The removal of that variable is again conditional upon
the decision of one of the two methods of control.


The stepwise analysis terminates when there are no more variables for entry or removal, or whenever 
the entry of a selected variable is rejected. 
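
The entry decision described above can be sketched in a few lines. The Python fragment below is illustrative only: the function names are invented, the F-value is taken to be the reduction in the sum of squares divided by the residual mean square after entry (which reproduces the figures printed in the example), and the probability of the F-value is supplied by the caller rather than computed.

```python
def sequential_f(ss_reduced, ss_residual, df_residual):
    """F-value for a candidate: reduction in the sum of squares divided by
    the residual mean square that results once the candidate is entered."""
    return ss_reduced / (ss_residual / df_residual)

def accept_entry(probability, alpha, ask_user=None):
    """Decide whether a candidate variable enters the regression.

    alpha strictly between 0 and 1: computer control, enter when the
    probability of the F-value exceeds 1 - alpha.
    alpha equal to 0: user control, the decision comes from ask_user().
    """
    if alpha == 0:
        return bool(ask_user())
    return probability > 1.0 - alpha

# Figures from the example: column 4 enters, column 2 is rejected.
print(round(sequential_f(1831.8962, 883.8669, 11), 4))
print(accept_entry(0.999574, 0.05), accept_entry(0.948438, 0.05))
```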


References 


Draper, N.R. and H. Smith. Applied Regression Analysis, John Wiley & Sons Inc., New York, 
1966. 


Efroymson, M.A. “Multiple Regression Analysis”, Mathematical Methods for Digital Computers, 
John Wiley & Sons Inc., New York, 1960. 


Source 
R. Hui 


I.P. Sharp Associates Limited
Calgary, Alberta

LIBRARY 33
PROBABILITY FUNCTIONS AND DISTRIBUTIONS

Workspaces: PROBDIST
            RANDOM


33 PROBDIST


FUNCTIONS

Function Header            Documentation        Description
R←V BINOM P                BINOMHOW             Generates binomial probabilities (cumulative).
R←NP BINOMIALPROB X        BINOMIALPROBHOW      Computes probabilities for the binomial distribution.
R←DF CHISQUARE P           CHISQUAREHOW         Computes the chi-square value, given the probability.
R←N CHISQUAREPROB X        CHISQUAREPROBHOW     Computes probabilities for the chi-square distribution.
R←N FPROB X                FPROBHOW             Computes probabilities for the F distribution.
R←GAUSS P                  GAUSSHOW             Computes Z (unit normal) when given the probability.
R←L HYPERGEOPROB X         HYPERGEOPROBHOW      Computes probabilities for the hypergeometric
                                                distribution.
R←N MATCHPROB X            MATCHPROBHOW         Computes probabilities for the number of matches.
R←P NBIN K                 NBINHOW              Computes successive terms of the negative binomial
                                                probability distribution.
R←P NORMAL D               NORMALHOW            Generates normal probabilities.
R←MS NORMALPROB X          NORMALPROBHOW        Computes probabilities for the normal distribution.
R←NORMORD X                NORMORDHOW           Computes the value of the standard normal density
                                                function at the point X.
R←N POIPROB M              POIPROBHOW           Generates a matrix with two columns and a row for
                                                each integer from zero to N. The first column contains
                                                individual term Poisson probabilities and the second
                                                the corresponding cumulative probabilities.
R←X POIPROBC M             POIPROBCHOW          Computes the Poisson cumulative probability at X.
R←X POIPROBI M             POIPROBIHOW          Computes the Poisson probability at the point X.
R←V POISSON M              POISSONHOW           Generates Poisson probabilities.
R←M POISSONPROB X          POISSONPROBHOW       Computes probabilities for the Poisson distribution.
N POITABLE M               POITABLEHOW          Generates a fully labelled table of Poisson individual
                                                and cumulative probabilities for the integers from
                                                zero to N.
R←P POIVAL M               POIVALHOW            Computes the smallest value for which the cumulative
                                                Poisson probability exceeds or is equal to the input
                                                argument P.
R←N RUNSPROB X             RUNSPROBHOW          Computes probabilities for the runs test.
R←M SIGNEDRANKPROB X       SIGNEDRANKPROBHOW    Computes probabilities for the signed-rank test.
R←DF STUDENT P             STUDENTHOW           Computes the value of the t distribution, given the
                                                probability.
R←N TPROB T                TPROBHOW             Computes probabilities for the t distribution.

BINOM 
Syntax 
R←V BINOM P
Description
BINOM generates the probabilities of obtaining I successes out of N trials for each of I=0 to I=N,
where P is the probability of success in any one trial. The resultant probabilities are then grouped
into classes as defined by the vector V.


Arguments
P  - The probability of obtaining a success in any one trial.
V  - Consists of four numbers:
     the left-hand end point of the first class, the class width, the number of classes, and the
     number of binomial trials per experiment (N).


Result 

R - The binomial probability of obtaining the number of successes falling in each class defined 
by V. The first and last entries in R are probabilities of the number of successes falling into 
either of the tails. 

Example 
R←(0 1 3 3) BINOM .1
R 

0 0.729 0.243 0.027 0.001 
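
The grouping performed in this example can be reproduced with the following Python sketch. It is not the BINOM code: the name binom_classes is invented, and the classes are assumed to be closed on the left and open on the right, an assumption that is consistent with the result shown above.

```python
from math import comb

def binom_classes(v, p):
    """Group binomial probabilities into classes.

    v = (left end point of first class, class width, number of classes, N).
    Returns the lower-tail probability, one probability per class, and the
    upper-tail probability.
    """
    start, width, nclasses, n = v
    result = [0.0] * (nclasses + 2)
    for i in range(n + 1):
        prob = comb(n, i) * p ** i * (1 - p) ** (n - i)
        if i < start:
            result[0] += prob                 # below the first class
        else:
            k = int((i - start) // width)     # index of the class holding i
            if k >= nclasses:
                result[-1] += prob            # beyond the last class
            else:
                result[k + 1] += prob
    return result

print([round(x, 3) for x in binom_classes((0, 1, 3, 3), 0.1)])
```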


Notes and Hints 


1) BINOM is limited to about 240 trials (N) per experiment; for larger numbers, the binomial
coefficients become too large for the capabilities of the machine.


2) For individual probabilities see the documentation for BINOMIALPROB in this workspace.
References


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley &
Sons Inc., New York, 1961.


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, London, 1957. 
Source 

L. Gibson, S. Maxwell, S. Swaminathan 

Institute of Computer Science 


University of Guelph 
Guelph, Ontario 
BINOMIALPROB
Syntax
R←NP BINOMIALPROB X
Description 
BINOMIALPROB computes probabilities for the binomial distribution. 
Arguments 
NP - A 2-element vector: 


NP[1] - N, the number of trials. 
NP[2] - P, the probability of success on each trial.


X - The variable(s).
Result 
R - Contains probabilities based on the binomial distribution. If the variable, X, is a scalar, the
result is a 2-element vector containing:
PROBABILITY (*≤X) and PROBABILITY (X<*).
If the variable is a vector of length n, the result is an n+1 element vector, call it P, where:
P[I]=PROBABILITY (Y[I]<*≤Y[I+1])
and
Y=(-INFINITY), X, (+INFINITY).


Examples 


PROB←(3 .1) BINOMIALPROB 1
PROB

0.972 0.028
PROB←(3 .1) BINOMIALPROB 0 1 2
PROB 

0.729 0.243 0.027 0.001 
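The interval convention used by the PROB functions in this workspace can be illustrated with a short Python sketch. It is an independent check, not the library's algorithm, and the names interval_probs and binomial_prob are hypothetical. It assumes each result element covers the half-open interval that excludes its lower cut point and includes its upper one, which is the reading that reproduces the two examples above.

```python
from math import comb, inf

def interval_probs(pmf, support, cuts):
    """P(Y[i] < * <= Y[i+1]) for Y = (-inf, cuts..., +inf) on a discrete support."""
    y = [-inf] + list(cuts) + [inf]
    return [sum(pmf(k) for k in support if lo < k <= hi)
            for lo, hi in zip(y, y[1:])]

def binomial_prob(n, p, cuts):
    """Interval probabilities for a Binomial(n, p) variable."""
    def pmf(k):
        return comb(n, k) * p**k * (1 - p) ** (n - k)
    return interval_probs(pmf, range(n + 1), cuts)

scalar = binomial_prob(3, 0.1, [1])        # compare with (3 .1) BINOMIALPROB 1
vector = binomial_prob(3, 0.1, [0, 1, 2])  # compare with (3 .1) BINOMIALPROB 0 1 2
print([round(v, 6) for v in scalar], [round(v, 6) for v in vector])
```

A scalar X is simply the one-cut case, so the same helper covers both forms of the right argument.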
Notes and Hints


1) The relative accuracy of the tail probabilities is one part in 10*6. Accuracy of other probabilities 
is as good as if they had been computed by subtraction from tail probabilities. 


2) Invalid input parameters are flagged with a DOMAIN ERROR. 
3) For an alternative method, see the documentation for BINOM in this workspace. 
References 


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., New York, 1961. 


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957. 
Source 
W. Knight 


University of New Brunswick 
Fredericton, N.B. 




CHISQUARE 
Syntax 
R←DF CHISQUARE P
Description 
CHISQUARE calculates the theoretical chi-square, given the degrees of freedom and the probability level. 


Arguments 


DF - The degrees of freedom (scalar).
P - The probability level. 


Result 


R - The calculated chi-square value. These are the numbers one typically finds in a chi-square
table. 


Example 
C←10 CHISQUARE .95
18.…
Notes and Hints 
For the inverse of this function, see the documentation for CHISQUAREPROB in this workspace. 


References 


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., New York, 1961.


Prins, J. New Paltz Library of Statistical Programs, Fourth Edition, State University College, New 
Paltz, N.Y., April 1972. 


Source 
J. Prins 


State University College 
New Paltz, N.Y.
CHISQUAREPROB
Syntax
R←N CHISQUAREPROB X
Description 
CHISQUAREPROB computes probabilities for the chi-square distribution. 
Arguments 


N  - Degrees of freedom. 
X - The variable(s). 


Result 


R - Contains probabilities based on the chi-square distribution. If the variable, X, is a scalar, the 
result is a 2-element vector containing: 


PROBABILITY (*≤X) and PROBABILITY (X<*).
If the variable is a vector of length n, the result is an n+1 element vector, call it P, where:
P[I]=PROBABILITY (Y[I]<*≤Y[I+1])
and
Y=(-INFINITY), X, (+INFINITY).
Examples 
PROB←10 CHISQUAREPROB 18.3007
PROB
0.9499017954 0.05009820462
PROB←10 CHISQUAREPROB 12.5 16 18.3
PROB
0.7470146767 0.1533529228 0.04952333902 0.05010906147
Notes and Hints 


1) If the right argument X is a vector, it must be in ascending order. 


2) The relative accuracy of the tail probabilities is one part in 10*6. Accuracy of other probabilities
is as good as if they had been computed by subtraction from tail probabilities. 


3) Invalid parameters are flagged with a DOMAIN ERROR. 


4) For an inverse of this function, see documentation for CHISQUARE in this workspace. 
Reference


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., New York, 1961.


Source 
W. Knight 


University of New Brunswick 
Fredericton, N.B. 
FPROB
Syntax
R←N FPROB X
Description 


FPROB computes probabilities for the F distribution. 


Arguments 

N - A 2-element vector containing the degrees of freedom:

N[1] - Numerator

N[2] - Denominator

X - The variable(s) in ascending order

Result 

R - Contains probabilities based on the F distribution. If the variable, X, is a scalar, the result
is a 2-element vector containing:
PROBABILITY (*≤X) and PROBABILITY (X<*).
If the variable is a vector of length n, the result is an n+1 element vector, call it P, where:
P[I]=PROBABILITY (Y[I]<*≤Y[I+1])
and
Y=(-INFINITY), X, (+INFINITY).
Examples 
PROB←(10 20) FPROB 2.35
PROB
0.9501759161 0.04982408387
PROB←10 20 FPROB 1.94 2.35 3.37
PROB 
0.9005548536 0.0496210625 0.03985104986 0.009973034017 
Notes and Hints 


1) The relative accuracy of the tail probabilities is one part in 10*6. Accuracy of other probabilities
is as good as if they had been computed by subtraction from tail probabilities. 


2) Invalid parameters are flagged with a DOMAIN ERROR. 
Reference


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957, Chapter 
8. 


Source 
W. Knight 


University of New Brunswick 
Fredericton, N.B. 
GAUSS

Syntax 

R←GAUSS P

Description 

GAUSS calculates the value for Z (the unit normal) when the cumulative probability is given. 

Argument 

P - A probability (or vector of probabilities). 

Result 

R - The Z value (or vector of values) of the unit normal. 

Example 

Z←GAUSS .99 .95 .90

2.326755333 1.64521144 1.281728757

Notes and Hints 

1) The probability P must be such that 0<P<1.

2) This function is really the inverse of that found in most texts. See the documentation for 
NORMALPROB in this workspace for the function which yields probabilities, given the Z value. 
Other related functions are NORMORD and NORMAL. 

References 


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., New York, 1961. 


Prins, J. New Paltz Library of Statistical Programs, State University College, New Paltz, N.Y. 
Source 
J. Prins 


State University College 
New Paltz, N.Y. 
HYPERGEOPROB
Syntax
R←N HYPERGEOPROB X
Description 


HYPERGEOPROB computes probabilities for the hypergeometric distribution. 


Arguments 

N - A 3-element vector: 

N[1] - Number of items in sample

N[2] - Size of population 1

N[3] - Size of population 2

X - Number of elements of population 1 in sample

Result 

R - Contains probabilities based on the hypergeometric distribution. If the variable, X, is a scalar,
the result is a 2-element vector containing:
PROBABILITY (*≤X) and PROBABILITY (X<*).
If the variable is a vector of length n, the result is an n+1 element vector, call it P, where:
P[I]=PROBABILITY (Y[I]<*≤Y[I+1])
and
Y=(-INFINITY), X, (+INFINITY).
Example 
Suppose a box contains 25 parts, of which 18 (N[2]) are acceptable and 7 (N[3]) are defective.
If 5 (N[1]) parts are selected, without replacement, what are the probabilities of this sample’s 
containing 0, 1, 2, 3, 4 or 5 (X) acceptable parts? 
PROB←(5 18 7) HYPERGEOPROB 0 1 2 3 4 5
PROB 


0.000395256917 0.01185770751 0.1007905138 0.3225296443 
0.4031620553 0.1612648221 0 
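The first six values in this example are the individual hypergeometric probabilities for 0 to 5 acceptable parts, and the final 0 is the probability of more than 5. They can be checked with the standard counting formula; the Python sketch below is only such a check, with the hypothetical name hypergeo_pmf, and is not the library's algorithm.

```python
from math import comb

def hypergeo_pmf(sample, pop1, pop2, x):
    """P(exactly x items from population 1) when `sample` items are drawn without replacement."""
    return comb(pop1, x) * comb(pop2, sample - x) / comb(pop1 + pop2, sample)

# Box of 25 parts: 18 acceptable (population 1), 7 defective (population 2), 5 drawn.
pmf = [hypergeo_pmf(5, 18, 7, x) for x in range(6)]
for x, prob in enumerate(pmf):
    print(x, round(prob, 10))
```

Because the cut points 0 to 5 cover the whole range of the variable, each interval in the result holds exactly one of these individual terms.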
Notes and Hints


1) The elements of the right argument X must be in ascending order, and the number of items in 
the sample (N[1]) must not exceed 5240.


2) The relative accuracy of the tail probabilities is one part in 10*6. Accuracy of other probabilities
is as good as if they had been computed by subtraction from tail probabilities. 


3) Invalid parameters are flagged with a DOMAIN ERROR. 
Reference 


Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, New York,
1960, Chapter 20. 


Source 
W. Knight 


University of New Brunswick 
Fredericton, N.B. 
MATCHPROB
Syntax
R←N MATCHPROB X
Description 
MATCHPROB computes probabilities for the number of matches. 
Arguments 


N - The number of objects permuted. 
X - The number of matches (in ascending order). 


Result 
R - If the variable, X, is a scalar, the result is a 2-element vector containing:
PROBABILITY (*≤X) and PROBABILITY (X<*).
If the variable is a vector of length n, the result is an n+1 element vector, call it P, where:
P[I]=PROBABILITY (Y[I]<*≤Y[I+1])
and
Y=(-INFINITY), X, (+INFINITY).


Examples 
PROB←20 MATCHPROB 5
PROB
0.9994058152 0.0005941848176
PROB←20 MATCHPROB 1 2 3
PROB 
0.7357588823 0.1839397206 0.0613132402 0.01898815688 
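For readers who want to verify these figures, the classical matching (rencontres) formula gives the probability of exactly m matches among n permuted objects as (1/m!) times the alternating sum of 1/j! for j from 0 to n-m. The Python sketch below applies that formula with exact fractions. It is an independent check under that assumption, not the library's method, and the names match_pmf and match_prob are hypothetical; cut points are assumed to be integers between 0 and n.

```python
from fractions import Fraction
from math import factorial

def match_pmf(n, m):
    """Exact probability of exactly m matches in a random permutation of n objects."""
    tail = sum(Fraction((-1) ** j, factorial(j)) for j in range(n - m + 1))
    return Fraction(1, factorial(m)) * tail

def match_prob(n, cuts):
    """Interval probabilities P(Y[i] < matches <= Y[i+1]) for integer cut points."""
    pmf = [match_pmf(n, m) for m in range(n + 1)]
    edges = [-1] + list(cuts) + [n]
    return [float(sum(pmf[lo + 1:hi + 1])) for lo, hi in zip(edges, edges[1:])]

probs = match_prob(20, [1, 2, 3])   # compare with 20 MATCHPROB 1 2 3
print(probs)
```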
Notes and Hints 


1) The relative accuracy of the tail probabilities is one part in 10*6. Accuracy of other probabilities 
is as good as if computed by subtraction from tail probabilities. 


2) Invalid parameters are flagged with a DOMAIN ERROR. 
Restriction 

N has to be less than or equal to 56. 

Source 

W. Knight 


University of New Brunswick 
Fredericton, N.B. 
NBIN
Syntax

R←P NBIN K

Description 

NBIN computes with full accuracy successive terms of the negative binomial probability distribution. 
Arguments


P - The value of the parameter P of the distribution
K - A vector of two values:


K[1] - The value of the parameter K
K[2] - The number of terms to be evaluated
Result 


R - A vector of probabilities from the negative binomial distribution, for 
0, 1, 2, ..., K[2]-1.


Example 
R←.7 NBIN 3 4
R 
0.343 0.3087 0.18522 0.09261 


This shows that the probability that the third successful trial will occur, after (say) 1 unsuccessful 
trial, is .3087, given that the probability of success on each trial is .7. 
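The terms in this example follow the usual negative binomial formula: the probability of j unsuccessful trials before the K-th success is C(K+j-1, j) times P to the power K times (1-P) to the power j. The Python sketch below evaluates that formula for the same arguments. It is a cross-check with the hypothetical name nbin, not the library's code, and it also forms the running sum that a plus-scan produces in APL.

```python
from itertools import accumulate
from math import comb

def nbin(p, k, terms):
    """P(j failures before the k-th success) for j = 0 .. terms-1."""
    return [comb(k + j - 1, j) * p**k * (1 - p) ** j for j in range(terms)]

probs = nbin(0.7, 3, 4)               # compare with .7 NBIN 3 4
cumulative = list(accumulate(probs))  # running totals of the terms
print([round(v, 6) for v in probs])
print([round(v, 6) for v in cumulative])
```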


Notes and Hints 

To obtain cumulative probabilities, the following can be executed: 
RR←+\P NBIN K

Restrictions 


A DOMAIN ERROR occurs when the combination of K[1] and K[2] produces a binomial coefficient
that is too large for storage allocation. 
Reference


Dyckman, T., S. Smidt and A. McAdams. Management Decision Making Under Uncertainty, 
Collier-MacMillan Ltd., 1969, Chapter 6. 


Source 
Dr. G.H. McLaughlin 
Newhouse Communications Center 


Syracuse University 
Syracuse, N.Y. 
NORMAL
Syntax
R←P NORMAL D
Description 


NORMAL will generate the probabilities for each of N classes (described by P), using the normal 
distribution with mean and standard deviation given by D. 


Arguments 

D - A 2-element vector made up of the mean and standard deviation, respectively, of the desired 
distribution. 

P - Three numbers describing the classes as follows: 


left-hand end point of the first class, class width, number of classes. 

Result 

R - The explicit result produced by NORMAL. Contains the normal probabilities requested (1 per 
class), as well as the probability of an event occurring in either tail of the distribution. These 
latter values are tacked onto the respective ends of R. 

Example 
PRB←(0 .1 5) NORMAL (0 1)

DATA MEAN =0 AND STANDARD DEV. =1
PRB 

0.5 0.03959513174 0.03953569348 0.03883795504 0.03764641813 
0.03607965112 0.3083051505 

Notes and Hints 

1) The number of classes must be given as an integer value. 


2) A normal fit may not be too valuable if there are 15 or more classes. 


3) By multiplying R by the sample size, the result of this function can be used for the chi-square 
test. 


4) Functions GAUSS, NORMALPROB, and NORMORD provide alternatives for working with the normal 
distribution. 
Reference


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., 1961.


Source 


L. Gibson, S. Maxwell, S. Swaminathan 
Institute of Computer Science 

University of Guelph 

Guelph, Ontario 


NORMALPROB 
Syntax 
R←MS NORMALPROB X
Description 
NORMALPROB computes probabilities for the normal distribution. 
Arguments 
MS - A 2-element vector:


MS[1] - Mean of the normal distribution.
MS[2] - Standard deviation of the normal distribution.


X - The variable(s) in ascending order.
Result 
R - Contains probabilities based on the normal distribution. If the variable, X, is a scalar, the result
is a 2-element vector containing:
PROBABILITY (*≤X) and PROBABILITY (X<*).
If the variable is a vector of length n, the result is an n+1 element vector, call it P, where:
P[I]=PROBABILITY (Y[I]<*≤Y[I+1])
and
Y=(-INFINITY), X, (+INFINITY).
Examples 
PROB←(0 1) NORMALPROB 2.32678
PROB
0.9900115113 0.009988488705
PROB←0 1 NORMALPROB 1.28 1.645 2.33


PROB 
0.899727432 0.05028766242 0.04008182998 0.009903075559 
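The interval probabilities in the second example can be reproduced, to about the accuracy quoted for this function, from the error function. The Python sketch below is one way to do so and is not the algorithm NORMALPROB uses; the names normal_cdf and normal_prob are hypothetical. Note that it forms the upper tail by subtraction, so unlike the library function it makes no claim about the relative accuracy of very small tail values.

```python
from math import erf, inf, sqrt

def normal_cdf(x, mean, sd):
    """Cumulative normal probability; erf handles the infinite end points."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def normal_prob(mean, sd, cuts):
    """P(Y[i] < * <= Y[i+1]) for Y = (-inf, cuts..., +inf), cuts ascending."""
    y = [-inf] + list(cuts) + [inf]
    cdf = [normal_cdf(v, mean, sd) for v in y]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

probs = normal_prob(0.0, 1.0, [1.28, 1.645, 2.33])  # compare with 0 1 NORMALPROB 1.28 1.645 2.33
print([round(v, 9) for v in probs])
```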




Notes and Hints 


1) The relative accuracy of the tail probabilities is one part in 10*6. Accuracy of other probabilities 
is as good as if they had been computed by subtraction from tail probabilities. 


2) Invalid parameters are flagged with a DOMAIN ERROR. 


3) For the normal (0,1), the function GAUSS yields the Z value, given the probability. See the 
documentation for GAUSS in this workspace. 


Restriction 
The normalized value(s) of X, i.e., |(X-MEAN)÷STD|, must be less than 18.
Reference 


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., 1961. 


Source 
W. Knight 


University of New Brunswick 
Fredericton, N.B. 
NORMORD

Syntax
R←NORMORD X
Description 
NORMORD computes R=f(X), the density of the normal (0,1) probability function at the point X. 
Arguments 
X  - The point at which the density of the normal (0,1) probability function is computed. 
Result 
R - The density.
Example 

R←NORMORD ¯.05 .05
0.3984439144 0.3984439144
Notes and Hints 
1) The exact formula is used. 


2) Other functions in this workspace concerning the normal distribution are GAUSS, NORMAL and 
NORMALPROB. 
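The density that NORMORD evaluates is the standard normal density, exp(-x²/2) divided by the square root of 2π. A minimal Python sketch of the same formula follows (the name normord is reused here only for illustration). Because the density is symmetric about zero, the two points of the example give the same value.

```python
from math import exp, pi, sqrt

def normord(x):
    """Density of the normal (0,1) distribution at the point x."""
    return exp(-x * x / 2.0) / sqrt(2.0 * pi)

values = [normord(-0.05), normord(0.05)]
print([round(v, 10) for v in values])
```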


Reference 


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., 1961. 


Source 


LCDR. J. Cook 
U.S. Navy
POIPROB
Syntax
R←N POIPROB M
Description 
POIPROB gives a matrix of Poisson individual and cumulative probabilities. 
Arguments 


N - Determines the number of terms to be calculated. 
M - The mean of the desired Poisson distribution. 


Result 
R - A (N+1) or ⌈N by 2 matrix. The first column contains the individual Poisson terms and the
second column contains the corresponding cumulative terms. The first row contains the
probabilities for 0, the second row for 1, the Nth row for N-1, etc.


Example 
T←5 POIPROB .8
T
0.4493289641 0.4493289641 
0.3594631713 0.8087921354 
0.1437852685 0.9525774039 
0.03834273827 0.9909201422 
0.007668547654 0.9985886899 
0.001226967625 0.9998156575 
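The two columns of this result can be reproduced directly from the Poisson formula, exp(-M) times M to the power x divided by x factorial, with a running sum for the cumulative column. The Python sketch below does this for an integer N. It is a cross-check with the hypothetical name poiprob, not the library's code, and it does not model the handling of non-integer or negative N.

```python
from math import exp, factorial

def poiprob(n, m):
    """Rows of (individual term, cumulative term) for x = 0 .. n, integer n >= 0."""
    rows, cumulative = [], 0.0
    for x in range(n + 1):
        term = exp(-m) * m**x / factorial(x)
        cumulative += term
        rows.append((term, cumulative))
    return rows

table = poiprob(5, 0.8)   # compare with 5 POIPROB .8
for x, (term, cum) in enumerate(table):
    print(x, round(term, 10), round(cum, 10))
```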


Notes and Hints 


1) Since the Poisson is defined only for mean M>0 and for X ∈ 0,1,2,3,..., the result is empty
if M≤0.


2) If N is less than zero it is treated as zero.
3) The related function POITABLE prints a fully labelled table of Poisson probabilities; other Poisson
functions in this workspace are POISSON, POISSONPROB, POIPROBI, POIPROBC and
POIVAL.
References


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., 1961. 


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957.
Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, 1960.
Source 


LCDR. J. Cook 
U.S. Navy 
POIPROBC
Syntax
R←X POIPROBC M
Description 


POIPROBC computes the Poisson cumulative probability at X. 


Arguments 

X - Specifies up to which integer the Poisson cumulative probability is to be calculated. 
M - The mean of the desired Poisson distribution. 

Result 

R - The cumulative probability. 

Example 


P←5 POIPROBC .8
0.9998156575
Notes and Hints 
1) Since the Poisson is defined only for M>0, R is set to -1 if M≤0.


2) Other related functions in this workspace are POISSON, POISSONPROB, POIPROBI, POIPROB, 
POITABLE and POIVAL. 


References 


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., 1961. 


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957. 
Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, 1960. 
Source 


LCDR. J. Cook 
U.S. Navy
POIPROBI
Syntax
R←X POIPROBI M
Description 
POIPROBI computes the Poisson individual probability for the value X. 
Arguments


X - The value for which the Poisson individual term probability is to be calculated. 
M - The mean of the desired Poisson distribution. 


Result 

R - The Poisson individual probability.

Example 

Taken from Dixon and Massey, page 352. See References. 
P←0 POIPROBI .8

0.4493289641

Notes and Hints 


1) R is set to -1 if M≤0 because the Poisson is defined only for M>0.


2) Other Poisson related functions in this workspace are POISSON, POISSONPROB, POIPROBC, 
POIPROB, POITABLE and POIVAL. 


References 


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., 1961. 


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957.
Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, 1960. 
Source 


LCDR. J. Cook 
U.S. Navy
POISSON
Syntax
R←V POISSON M
Description 


POISSON will generate expected Poisson probabilities associated with each of a group of classes, where 
M is the population mean and V describes the classes. 


Arguments 


V  - Consists of three numbers: 
- left-hand end point of the first class, 
- class width, 
- number of classes. 


M - The average number of successes over a given interval (i.e., the mean of the data).
Result
R - The explicit result produced by POISSON. It contains the Poisson probabilities for each class
requested, as well as a value tacked on to each end, this being the probability of each of the
tails of the distribution.


Example 

Taken from Dixon and Massey, page 352. See References. 
EXP←(0 1 5) POISSON .8
EXP 

0 0.4493289641 0.3594631713 0.1437852685 0.03834273827
0.007668547654 0.001411310146 

Notes and Hints 

1) The argument V must be made up entirely of integer values. 


2) A Poisson fit may not be meaningful if there are 15 or more classes. 


3) To generate probabilities for individual values, see the function POISSONPROB in this workspace; 
other related functions are POIPROBI, POIPROBC, POIPROB, POITABLE and POIVAL. 
References


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., 1961.


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957. 

Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, London, 1960. 
Source 

L. Gibson, S. Maxwell, S. Swaminathan 

Institute of Computer Science 


University of Guelph 
Guelph, Ontario 
POISSONPROB
Syntax
R←M POISSONPROB X
Description 
POISSONPROB computes probabilities for the Poisson distribution. 
Arguments 


M - The expectation of the distribution. 
X - The variable(s) in ascending order. 


Result 
R - If the variable, X, is a scalar, the result is a 2-element vector containing:
PROBABILITY (*≤X) and PROBABILITY (X<*).
If the variable is a vector of length n, the result is an n+1 element vector, call it P, where:
P[I]=PROBABILITY (Y[I]<*≤Y[I+1])
and
Y=(-INFINITY), X, (+INFINITY).
Examples 
PROB←.8 POISSONPROB 2
PROB
0.9525774039 0.04742259607
PROB←.8 POISSONPROB 2 3
PROB 
0.9525774039 0.03834273827 0.0090798578 


Notes and Hints 


1) The relative accuracy of the tail probabilities is one part in 10*6. Accuracy of other probabilities
is as good as if they had been computed by subtraction from tail probabilities. 


2) Invalid parameters are flagged with a DOMAIN ERROR. 


3) See the documentation for the functions POISSON, POIPROBI, POIPROBC, POIPROB, POITABLE
and POIVAL in this workspace for alternative approaches. 
References


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., 1961. 


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957. 
Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, 1960. 
Source 

W. Knight 


University of New Brunswick 
Fredericton, N.B. 
POITABLE
Syntax 
N POITABLE M 
Description 
POITABLE prints a fully labelled table of Poisson individual and cumulative probabilities. 
Arguments 
N - Determines the size of the result matrix. The matrix has N+1 rows, if N is integer; otherwise
it has ⌈N rows, the last row representing ⌊N.
M - The mean of the desired Poisson distribution. 
Output 
A table in the following format is printed: 


TABLE FOR POISSON DISTRIBUTION WITH PARAMETER M. 


X    INDIVIDUAL       CUMULATIVE
     TERM             TERM
0    0.XXXXXXXXXX     0.XXXXXXXXXX


N    0.XXXXXXXXXX     0.XXXXXXXXXX
Example 
Taken from Dixon and Massey, page 352. See References. 
5 POITABLE 0.8 


TABLE FOR POISSON DISTRIBUTION WITH PARAMETER 0.8 


X    INDIVIDUAL       CUMULATIVE
     TERM             TERM

0    0.449328964      0.449328964
1    0.359463171      0.808792135
2    0.143785269      0.952577404
3    0.0383427383     0.990920142
4    0.00766854765    0.99858869
5    0.00122696762    0.999815657
Notes and Hints


1) If the cumulative term gets within “fuzz” (⎕CT) of 1 before the line for N is printed, the table
is terminated with the line 


REMAINING TERMS ARE INSIGNIFICANT. 
2) Error messages are printed if: 


- N is not a positive integer 
- M is not strictly greater than 0. 


3) This output cannot be stored or manipulated. The related function POIPROB produces a matrix
of Poisson probabilities which can be stored and/or manipulated. Other Poisson functions available
in this workspace are POISSON, POISSONPROB, POIPROBI, POIPROBC and POIVAL.


References 


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., 1961. 


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957. 
Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, 1960.
Source 


LCDR. J. Cook 
U.S. Navy 
POIVAL
Syntax
R←P POIVAL M
Description 
POIVAL computes the least integer greater than or equal to zero for which the cumulative Poisson 
probability exceeds the input probability, P. That is, it finds the smallest integer, X≥0, such that a
random Poisson variable is less than or equal to X with probability greater than P. 


Arguments 


P - The input probability. 
M - The mean of the Poisson distribution. 


Result 
R - The least integer for which the required condition is satisfied. 
Example 
X←.95 POIVAL .8
X
2 
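The search described above can be sketched in a few lines of Python: accumulate Poisson terms from zero upward and stop at the first integer whose cumulative probability exceeds P. This is an illustration of the definition given in the Description, not the library's implementation. The name poival is reused only for readability, and the error handling (raising an exception instead of returning a flag value) is an assumption of this sketch.

```python
from math import exp

def poival(p, m):
    """Smallest integer x >= 0 with P(Poisson(m) <= x) > p."""
    if m <= 0 or not (0 <= p < 1):
        raise ValueError("need m > 0 and 0 <= p < 1")
    x = 0
    term = exp(-m)            # P(X = 0)
    cumulative = term
    # Stop once the cumulative probability exceeds p (or the terms underflow to zero).
    while cumulative <= p and term > 0.0:
        x += 1
        term *= m / x         # P(X = x) obtained from P(X = x - 1)
        cumulative += term
    return x

x = poival(0.95, 0.8)   # compare with .95 POIVAL .8
print(x)
```

With a mean of 0.8 the cumulative probabilities are about 0.449, 0.809 and 0.953 for 0, 1 and 2, so 2 is the first value whose cumulative probability exceeds .95.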


Notes and Hints 


1) Since the Poisson requires M>0, and since P is a probability which should be such that 0≤P≤1,
this function sets R equal to 1 if


M≤0 or P<0 or P>1


2) Other related functions in this workspace are POISSON, POISSONPROB, POIPROBC, POIPROBI, 
POIPROB and POITABLE. 


References 


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., 1961. 


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957. 
Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, 1960.
Source 


LCDR. J. Cook 
U.S. Navy 
RUNSPROB
Syntax
R←N RUNSPROB X
Description 
RUNSPROB computes the probability distribution for the number of runs. 
Arguments 
N - A 2-element vector containing the number of occurrences of the first symbol and of the second 
symbol, respectively. 
X - Numbers of runs in the sequence in ascending order. 
Result 
R - If the variable, X, is a scalar, the result is a 2-element vector containing:
PROBABILITY (*≤X) and PROBABILITY (X<*).
If the variable is a vector of length n, the result is an n+1 element vector, call it P, where:
P[I]=PROBABILITY (Y[I]<*≤Y[I+1])
and
Y=(-INFINITY), X, (+INFINITY).
Examples 
PROB←(10 10) RUNSPROB 7
PROB
0.05125679274 0.9487432073
PROB←10 10 RUNSPROB 5 6 7
PROB 
0.004492411613 0.01402931434 0.03273506679 0.9487432073 
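These values agree with the standard exact distribution of the number of runs in a random arrangement of two kinds of symbols. The Python sketch below counts arrangements with an even or odd number of runs using binomial coefficients. It is an independent check, not the library's algorithm; the names runs_pmf and runs_prob are hypothetical, and both counts are assumed to be at least 1.

```python
from math import comb, inf

def runs_pmf(n1, n2, r):
    """P(exactly r runs) among all arrangements of n1 and n2 symbols of two kinds."""
    if r < 2:
        return 0.0
    k, odd = divmod(r, 2)
    if odd:   # r = 2k+1: one symbol forms k+1 runs, the other k runs
        ways = comb(n1 - 1, k) * comb(n2 - 1, k - 1) + comb(n1 - 1, k - 1) * comb(n2 - 1, k)
    else:     # r = 2k: each symbol forms k runs, and either symbol may come first
        ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
    return ways / comb(n1 + n2, n1)

def runs_prob(n1, n2, cuts):
    """P(Y[i] < runs <= Y[i+1]) for Y = (-inf, cuts..., +inf)."""
    y = [-inf] + list(cuts) + [inf]
    support = range(2, n1 + n2 + 1)
    return [sum(runs_pmf(n1, n2, r) for r in support if lo < r <= hi)
            for lo, hi in zip(y, y[1:])]

probs = runs_prob(10, 10, [5, 6, 7])   # compare with 10 10 RUNSPROB 5 6 7
print([round(v, 12) for v in probs])
```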
Notes and Hints 


1) The relative accuracy of the tail probabilities is one part in 10*6. Accuracy of other probabilities 
is as good as if they had been computed by subtraction from tail probabilities. 


2) Invalid parameters are flagged with a DOMAIN ERROR. 


3) For a function that calculates the runs and performs the runs test, see the documentation for the 
function RUNSTEST in the workspace 31 NONPARAMETRIC. 
Restriction
The program handles strings of length 200 or less. 
Reference 


Siegel, S. Nonparametric Statistics for the Behavioural Sciences, McGraw-Hill, New York, 1956, 
Chapter 4. 


Source 
W. Knight 


University of New Brunswick 
Fredericton, N.B. 
SIGNEDRANKPROB
Syntax
R←N SIGNEDRANKPROB X
Description 
SIGNEDRANKPROB computes probabilities for the signed rank test. 
Arguments 


X - Sum(s) of positive ranks, in ascending order.
N - The number of ranks.


Result 
R - If the variable, X, is a scalar, the result is a 2-element vector containing:
PROBABILITY (*≤X) and PROBABILITY (X<*).
If the variable is a vector of length n, the result is an n+1 element vector, call it P, where:
P[I]=PROBABILITY (Y[I]<*≤Y[I+1])
and
Y=(-INFINITY), X, (+INFINITY).
Examples 
Taken from Conover, page 208. See Reference. 
PROB←11 SIGNEDRANKPROB 24.5
PROB
0.232421875 0.767578125
PROB←11 SIGNEDRANKPROB 24.5 30
PROB 
0.232421875 0.1831054688 0.5844726563 
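Under the null hypothesis each of the N ranks is equally likely to carry a positive or a negative sign, so the sum of positive ranks is distributed as the sum of a random subset of 1, 2, ..., N. The Python sketch below counts subsets by their sum and converts the counts to interval probabilities. It is a cross-check under that assumption (no ties), not the library's algorithm, and the names signed_rank_counts and signed_rank_prob are hypothetical.

```python
from math import floor

def signed_rank_counts(n):
    """counts[s] = number of subsets of {1, ..., n} whose elements sum to s."""
    max_sum = n * (n + 1) // 2
    counts = [1] + [0] * max_sum
    for rank in range(1, n + 1):
        for s in range(max_sum, rank - 1, -1):   # descending, so each rank is used once
            counts[s] += counts[s - rank]
    return counts

def signed_rank_prob(n, cuts):
    """P(Y[i] < rank sum <= Y[i+1]) for Y = (-inf, cuts..., +inf), cuts ascending."""
    counts = signed_rank_counts(n)
    total = 2 ** n

    def cdf(c):
        top = min(floor(c), len(counts) - 1)
        return sum(counts[: top + 1]) / total if top >= 0 else 0.0

    edges = [0.0] + [cdf(c) for c in cuts] + [1.0]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]

probs = signed_rank_prob(11, [24.5, 30])   # compare with 11 SIGNEDRANKPROB 24.5 30
print(probs)
```

For N=11 there are 2048 equally likely sign patterns, of which 476 give a rank sum of 24 or less, which is where the first value 0.232421875 comes from.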


Notes and Hints 


1) The relative accuracy of the tail probabilities is one part in 10*6. Accuracy of other probabilities 
is as good as if they had been computed by subtraction from tail probabilities. 


2) Invalid parameters are flagged with a DOMAIN ERROR. 
Reference

Conover, W.J. Practical Nonparametric Statistics, John Wiley & Sons Inc., 1971, Chapter 5. 
Source 

W. Knight 


University of New Brunswick 
Fredericton, N.B. 
STUDENT
Syntax
R←DF STUDENT P
Description 


STUDENT calculates values from the Student t distribution when the probability and degrees of freedom 
are given. 


Arguments 


P - A probability (or vector of probabilities). 
DF - The degrees of freedom (1 value only).


Result 
R - The resulting value(s) (1 per probability).
Example 
T←10 STUDENT .90 .95
… 1.81291134
Notes and Hints 
1) This is the value found in t tables in most books. 
2) For the inverse of this function, see the documentation for TPROB in this workspace. 


References 


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., New York, 1961. 


Prins, J. New Paltz Library of Statistical Programs, Fourth Edition, State University College, New 
Paltz, N.Y., April 1972. 


Source 
J. Prins 


State University College 
New Paltz, N.Y.
TPROB
Syntax
R←N TPROB T
Description 
TPROB computes probabilities for the t distribution. 
Arguments 


N  - Degrees of freedom. 
T - The variable(s), in ascending order. 


Result 


R - Contains probabilities based on the t distribution. If the variable, T, is a scalar, the result is
a 2-element vector containing:
PROBABILITY (*≤T) and PROBABILITY (T<*).
If the variable is a vector of length n, the result is an n+1 element vector, call it P, where:
P[I]=PROBABILITY (Y[I]<*≤Y[I+1])
and
Y=(-INFINITY), T, (+INFINITY).
Examples 
PROB←10 TPROB 1.372389311
PROB
0.9000309823 0.09996901771
PROB←10 TPROB 1.372 1.812 2.228
PROB 
0.8999723293 0.04999003967 0.02503174512 0.02500588591 
Notes and Hints 


1) The relative accuracy of the tail probabilities is one part in 10*6. Accuracy of other probabilities 
is as good as if they had been computed by subtraction from tail probabilities. 


2) Invalid parameters are flagged with a DOMAIN ERROR. 
3) For an inverse of this function, see the documentation for STUDENT in this workspace. 


4) The workspace 31 TTEST contains functions for performing the t-test.
Reference


Brownlee, K.A. Statistical Theory and Methodology in Science and Engineering, John Wiley & 
Sons Inc., 1961. 


Source 
W. Knight 


University of New Brunswick 
Fredericton, N.B. 
33 RANDOM


FUNCTIONS


Function Header             Documentation       Description

R←X DISCRETE (N,P)          DISCRETEHOW         Generates random numbers from a user-defined discrete distribution.
R←NCAUCHY N                 NCAUCHYHOW          Generates random numbers from the Cauchy distribution.
R←NEXPONENTIAL N            NEXPONENTIALHOW     Generates random numbers from the exponential distribution.
R←NEXTREME N                NEXTREMEHOW         Generates random numbers from the type I extreme value distribution.
R←NLAPLACE N                NLAPLACEHOW         Generates random numbers from the Laplace distribution.
R←NLOGISTIC N               NLOGISTICHOW        Generates random numbers from the logistic distribution.
R←NNORMAL N                 NNORMALHOW          Generates random numbers from the normal distribution.
R←NUNIFORM N                NUNIFORMHOW         Generates random numbers from the uniform distribution.
R←P PBETA Q                 PBETAHOW            Generates random numbers from the beta distribution.
R←M PBINOMIAL P             PBINOMIALHOW        Generates random numbers from the binomial distribution.
R←PCHISQUARE M              PCHISQUAREHOW       Generates random numbers from the chi-square distribution.
R←N1 PF N2                  PFHOW               Generates random numbers from the F distribution.
R←PGAMMA P                  PGAMMAHOW           Generates random numbers from the gamma distribution.
R←PGEOMETRIC X              PGEOMETRICHOW       Generates random numbers from the geometric distribution.
R←K PHYPER M GEOMETRIC 8    PHYPERHOW           Generates random numbers from the hypergeometric distribution.
R←M PLOGNORMAL S            PLOGNORMALHOW       Generates random numbers from the lognormal distribution.
R←M PNEGBIN X               PNEGBINHOW          Generates random numbers from the negative binomial distribution.
R←s PNON N1 CENTRALF N2     PNONHOW             Generates random numbers from the non-central F distribution.
R←d PNONCENTCHISQ M         PNONCENTCHISQHOW    Generates random numbers from the non-central chi-square distribution.
R←^ PNONCENTRALT M          PNONCENTRALTHOW     Generates random numbers from the non-central t distribution.
R←PPOISSON M                PPOISSONHOW         Generates random numbers from the Poisson distribution.
R←PT M                      PTHOW               Generates random numbers from the t distribution.
RANDOMIZE                   RANDOMIZEHOW        Sets random number generator seed to a random value.


DISCRETE 
Syntax 
R←X DISCRETE (N,P)
Description 


DISCRETE generates random numbers from a discrete, user-defined probability density function. 


Arguments 

X - a vector of the value set of the discrete distribution. 

N - the number of random numbers to be generated. 

P - a vector whose length is the same as that of X, and whose values are greater than or equal
to zero. It can be either:

1) a set of probabilities for the values in X, or
2) a frequency distribution of X.


Result 
R - a vector of N random numbers from X following the distribution governed by P.
Example 


2 3 4 5 DISCRETE 5 .2 .25 .3 .25
2 5 4 4 3
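
The sampling that DISCRETE describes can be sketched in Python as follows. This is an illustration only, not the workspace's APL code: the function name `discrete`, the `rng` parameter, and the cumulative-weight lookup are assumptions made for the example. Because only the proportions of P matter, the same routine accepts either probabilities or raw frequencies.

```python
import bisect
import random
from itertools import accumulate

def discrete(x, n, p, rng=random):
    """Return n draws from the values x, weighted by p.

    p may hold probabilities or raw frequencies; only its proportions matter.
    """
    if len(x) != len(p) or any(w < 0 for w in p):
        raise ValueError("p must match x in length and be non-negative")
    cum = list(accumulate(p))            # running totals of the weights
    total = cum[-1]
    if total <= 0:
        raise ValueError("p must contain at least one positive weight")
    out = []
    for _ in range(n):
        u = rng.random() * total         # uniform on [0, total)
        # index of the first running total that exceeds u
        idx = min(bisect.bisect_right(cum, u), len(x) - 1)
        out.append(x[idx])
    return out

# Same arguments as the example above; the seed is arbitrary.
sample = discrete([2, 3, 4, 5], 5, [0.2, 0.25, 0.3, 0.25], random.Random(16807))
```

The values drawn depend on the generator and seed, so they will not match the APL output shown above.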


Notes and Hints 


1) This function is quite useful in applications where there is a history of data for the random 
variable; it can be incorporated into simulation programs in such areas as inventory control, 
production management, and queueing systems, to name a few. 


2) The system variable ⎕RL should be used to set the seed for random number generation. The
default value of ⎕RL is 16807, and will be changed each time ?, the primitive random number
generator, is used.


Source 
F. Arthur 


I.P. Sharp Associates Limited 
Calgary, Alberta 


NCAUCHY 
Syntax
R←NCAUCHY N
Description
NCAUCHY generates random numbers from the Cauchy distribution with distribution function
0.5+(ARCTAN X)÷3.141592654
Argument 
N - Determines the size and shape of the result. 
If N is a scalar (say 4), 4 random numbers would be generated.
If N is a vector (say 3 4), 12 numbers would be generated in a 3x4 matrix.
Result
R - The random number(s).
Example 


R←NCAUCHY 2 3
R
¯0.768032%971 0.5781450089 2.030990664
¯0.658232229 ¯2.044240395 0.3327505411
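
One way to generate such deviates is to invert the distribution function given in the Description. The Python sketch below assumes that approach; the source does not say how NCAUCHY is implemented, and the names `ncauchy` and `cauchy_cdf` are hypothetical.

```python
import math
import random

def cauchy_cdf(x):
    """Distribution function from the Description: 0.5 + atan(x)/pi."""
    return 0.5 + math.atan(x) / math.pi

def ncauchy(n, rng=random):
    """n standard Cauchy deviates by inverting the distribution function."""
    # Solving u = 0.5 + atan(x)/pi for x gives x = tan(pi * (u - 0.5)).
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

draws = ncauchy(6, random.Random(16807))
```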


Notes and Hints 

The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator, is
used.


Reference 


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source 
R.E. Wheeler 


E.I. duPont de Nemours & Co. Inc. 
Wilmington, Delaware 
NEXPONENTIAL
Syntax
R←NEXPONENTIAL N
Description 


NEXPONENTIAL generates random number(s) from the exponential distribution with the density: 
EXP( -X). 


Argument 
N - Determines the size and shape of the result. 
If N is a scalar (say 4), 4 random numbers would be generated.
If N is a vector (say 3 4), 12 random numbers would be generated in a 3x4 table.
Result 
R - The random number(s). 


Example 


R←NEXPONENTIAL 2 3


R 
5.187098495 0.08248034943 0.9624526577 
0.6606190581 1.065790275 1.331300787 


Notes and Hints 

The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator, is
used.


Reference 


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source
R.E. Wheeler
E.I. duPont de Nemours & Co. Inc.
Wilmington, Delaware 
NEXTREME
Syntax
R←NEXTREME N
Description


NEXTREME generates random numbers from the Type I extreme value distribution, with distribution
function EXP(-EXP(-X)).


Argument 
N  - Determines the size and shape of the result. 
If N is a scalar (say 4), 4 random numbers would be generated.
If N is a vector (say 3 4), 12 random numbers would be generated in a 3x4 matrix.
Result 
R - The random number(s). 


Example 


R←NEXTREME 2 2
R
¯0.2106115102 1.70405714
¯0.7354464052 0.788051907
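
The distribution function EXP(-EXP(-X)) has a closed-form inverse, so deviates can be produced by inversion. The Python sketch below assumes that method for illustration; the source does not state the algorithm NEXTREME uses, and the helper names are hypothetical.

```python
import math
import random

def open_unit(rng=random):
    """Uniform deviate strictly inside (0, 1), so both logs below are finite."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u

def extreme_cdf(x):
    """Type I extreme value distribution function exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

def nextreme(n, rng=random):
    """n deviates from solving u = exp(-exp(-x)), i.e. x = -ln(-ln u)."""
    return [-math.log(-math.log(open_unit(rng))) for _ in range(n)]

draws = nextreme(4, random.Random(16807))
```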


Notes and Hints 

The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator, is
used.


Reference 


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source 
R.E. Wheeler 


E.I. duPont de Nemours & Co. Inc.
Wilmington, Delaware 
NLAPLACE
Syntax
R←NLAPLACE N
Description


NLAPLACE generates random number(s) from the Laplace distribution, with density
0.5×EXP(-ABS X)


Argument 
N - Determines the size and shape of the result.
If N is a scalar (say 4), 4 random numbers would be generated.
If N is a vector (say 3 4), 12 random numbers would be generated in a 3x4 matrix. 
Result 
R - The random number(s). 
Example 
R←NLAPLACE 4
R 
2.742223746 0.3742801091 3.%86753635 1.629151174 
Notes and Hints 
The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator, is
used.


Reference 


Wheeler, R.E. "Random Variable Generators", APL Quote Quad, Vol. 4, No. 3, April 1973, pp. 
7-16. 


Source 
R.E. Wheeler 


E.I. duPont de Nemours & Co. Inc.
Wilmington, Delaware 
NLOGISTIC
Syntax
R←NLOGISTIC N
Description


NLOGISTIC generates random number(s) from the logistic distribution, with distribution function
(1+EXP(-X))*¯1


Argument 
N  - Determines the size and shape of the result. 
If N is a scalar (say 4), 4 random numbers would be generated.
If N is a vector (say 3 4), 12 random numbers would be generated in a 3x4 matrix. 
Result
R - The random number(s).
Example
R←NLOGISTIC 5
R
0.2448740077 ¯0.9019279561 0.3410252118 ¯0.7509638088
0.05837776956
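
The logistic distribution function (1+EXP(-X))*¯1 can also be inverted directly. The following Python sketch assumes inversion for illustration; it is not taken from the workspace, and the names `nlogistic` and `logistic_cdf` are hypothetical.

```python
import math
import random

def logistic_cdf(x):
    """Logistic distribution function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def nlogistic(n, rng=random):
    """n deviates from solving u = 1/(1+exp(-x)), i.e. x = ln(u / (1 - u))."""
    out = []
    while len(out) < n:
        u = rng.random()
        if 0.0 < u < 1.0:                # skip the endpoint where the log diverges
            out.append(math.log(u / (1.0 - u)))
    return out

draws = nlogistic(5, random.Random(16807))
```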
Notes and Hints 
The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator, is
used.


Reference 


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source 
R.E. Wheeler 


E.I. duPont de Nemours & Co. Inc. 
Wilmington, Delaware 
NNORMAL
Syntax
R←NNORMAL N
Description 
NNORMAL generates random numbers from the normal distribution with zero mean and unit variance. 
Argument 
N - Determines the number of random numbers in the result. 


If N is a scalar (say 4), 4 random numbers would be generated. 
If N is a vector (say 3 4), 12 random numbers would be generated in a 3x4 matrix.


Result 
R - The random number(s). 
Example
R←NNORMAL 3 4
R
¯1.13106236 ¯0.122960874 0.690083969 1.32421146
0.0921959714 0.427113098 ¯0.736787829 ¯1.33238546
1.08876157 0.576134859 ¯0.98917729 0.947203074


Notes and Hints 


1) Remember that normal deviates with mean M and standard deviation S can be generated from
R (the result of NNORMAL) by M+S×R.


2) The system variable ⎕RL should be used to set the seed for random number generation. The
default value of ⎕RL is 16807, and will be changed each time ?, the primitive random number
generator, is used.


Methodology 


The function uses the polar generator algorithm of Box and Muller. 
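
As an illustration of the method named above, here is a Python sketch of the basic Box and Muller transform, together with the M+S×R rescaling from Notes and Hints. It is an assumption that NNORMAL follows this basic form; the function names `nnormal` and `rescale` are hypothetical and the code is not the workspace's APL.

```python
import math
import random

def nnormal(n, rng=random):
    """n standard normal deviates via the Box and Muller transform."""
    out = []
    while len(out) < n:
        u1 = 1.0 - rng.random()              # in (0, 1], keeps the log finite
        u2 = rng.random()
        radius = math.sqrt(-2.0 * math.log(u1))
        angle = 2.0 * math.pi * u2
        # each pair of uniforms yields two independent normal deviates
        out.append(radius * math.cos(angle))
        out.append(radius * math.sin(angle))
    return out[:n]

def rescale(r, m, s):
    """Deviates with mean m and standard deviation s, i.e. M+S×R."""
    return [m + s * z for z in r]

z = nnormal(20000, random.Random(16807))
scaled = rescale(z, 10.0, 2.0)
```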
References


Box, G. and M. Muller. “A note on the generation of random normal deviates”, Annals of Mathematical
Statistics, 29, pp. 610-611.


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source 
R.E. Wheeler 


E.I. duPont de Nemours & Co. Inc. 
Wilmington, Delaware 


NUNIFORM 
Syntax
R←NUNIFORM N
Description 
NUNIFORM generates random number(s) from the uniform distribution on (0,1). 
Argument 
N  - Determines the size and shape of the result. 


If N is a scalar (say 4), 4 random numbers would be generated.
If N is a vector (say 3 4), 12 random numbers would be generated in a 3x4 matrix.


Result 
R - The random number(s). 
Example 
R←NUNIFORM 3 2
R 
0.5959271915 0.748311071 
0.864171537 0.131023061 
0.1045793605 0.6653081085 


Notes and Hints 

The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator,
is used.


Reference 


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source
R.E. Wheeler
E.I. duPont de Nemours & Co. Inc.
Wilmington, Delaware 
PBETA
Syntax
R←P PBETA Q
Description 
PBETA generates random number(s) from the beta distribution. 
Arguments 
P and Q - Parameters defined according to the density:
(÷B(P,Q))×(X*P-1)×(1-X)*Q-1


where B(P,Q) is the complete beta function. P and Q can be any arrays of matching
shape; alternatively, one or both may be scalars.


Result
R - The random number(s). The shape of R will be scalar if P and Q are both scalar. Otherwise
it will have the shape of the non-scalar argument(s), with each number in the result
corresponding to the appropriate values in that position of the argument(s).
Example 
R←3 PBETA 4 5 7
R
0.4340549441 0.7517197598 0.1777338161
Notes and Hints 
1) This function is defined and efficient for all real values such that P and Q are greater than 0. 
2) The system variable ⎕RL should be used to set the seed for random number generation. The
default value of ⎕RL is 16807, and will be changed each time ?, the primitive random number
generator, is used.


References 


Bekessy, A. "Remarks on beta distributed random numbers", Pub. of the Math. Inst., Hungarian 
Academy of Science, Series A, Vol. 9, pp. 565-571. 


Johnk, M.D. “Erzeugung von betaverteilten und gammaverteilten Zufallszahlen”, Metrika, Vol. 8,
pp. 5-15.


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16.
Source


R.E. Wheeler 
E.I. duPont de Nemours & Co. Inc. 
Wilmington, Delaware 
PBINOMIAL
Syntax
R←M PBINOMIAL P
Description 


PBINOMIAL generates random number(s) from the binomial distribution. 


Arguments 
M - The number of trials. 
P - The probability of success in a trial. 


M and P can be any arrays of matching shape. Alternately one or both can be scalars. 

Result

R - The random number(s). The shape of R will be scalar if M and P are both scalar. Otherwise
it will have the shape of the non-scalar argument(s), with each number in the result
corresponding to the appropriate values in that position of the argument(s).


Examples 


R←10 PBINOMIAL .3
R
1
R←10 20 40 PBINOMIAL .5
R
3 8 21
R←10 20 PBINOMIAL .5 .6
R
6 13


Notes and Hints 

The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator,
is used.


Reference 


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source 
R.E. Wheeler 


E.I. duPont de Nemours & Co. Inc.
Wilmington, Delaware 
PCHISQUARE
Syntax
R←PCHISQUARE M
Description 


PCHISQUARE generates random number(s) from the chi-square distribution. 


Argument 

M - Specifies the degrees of freedom. M can be any shape array. 
Result 

R - The random number(s). R will be the same shape as M.
Example 


R←PCHISQUARE 15 25
22 анаа 27.83171161 
Notes and Hints 
The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator,
is used.

Reference 


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source 
R.E. Wheeler 


E.I. duPont de Nemours & Co. Inc.
Wilmington, Delaware 
PF
Syntax
R←N1 PF N2
Description
PF generates random number(s) from the F distribution.
Arguments

N1 and N2 - Specify the degrees of freedom. N1 and N2 can be any arrays of matching shape.
Alternately one or both can be scalars.


Result 
R - The random variable(s). The shape of R will be scalar if N1 and N2 are both scalar. Otherwise
it will have the shape of the non-scalar argument(s), with each number in the result
corresponding to the appropriate values in that position of the argument(s).


Examples 
10 PF 30 40 50
1.7231316315 0.6035207379 0.8030013572
3 PF 10
1.180109959
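
An F deviate with N1 and N2 degrees of freedom is, by definition, the ratio of two independent chi-square deviates, each divided by its degrees of freedom. The Python sketch below uses that definition; whether PF is implemented this way is not stated in the source, and the names `chisquare` and `pf` are hypothetical.

```python
import random

def chisquare(df, rng=random):
    """Chi-square deviate: a gamma deviate with shape df/2 and scale 2."""
    return rng.gammavariate(df / 2.0, 2.0)

def pf(n1, n2, rng=random):
    """F deviate as (chi-square(n1)/n1) / (chi-square(n2)/n2)."""
    return (chisquare(n1, rng) / n1) / (chisquare(n2, rng) / n2)

rng = random.Random(16807)
draws = [pf(10, 50, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)   # theoretical mean is n2/(n2-2) for n2 > 2
```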
Notes and Hints 
The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator,
is used.


Reference 


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp. 
7-16. 


Source 
R.E. Wheeler 


E.I. duPont de Nemours & Co. Ltd. 
Wilmington, Delaware 
PGAMMA
Syntax
R←PGAMMA P
Description 
PGAMMA generates random number(s) from the gamma distribution. 
Argument 
P - The parameter defined according to the density
(÷GAMMA(P))×(X*P-1)×*-X.
P may be any shape array.
Result
R - The random number(s). R will have the same shape as P.
Example 
R←PGAMMA 4 .5 .7
0. ene 0.1494028865 0.01709296069 
Notes and Hints 


1) This function is defined and efficient for all real values such that P>0.


2) The system variable ⎕RL should be used to set the seed for random number generation. The
default value of ⎕RL is 16807, and will be changed each time ?, the primitive random number
generator, is used.


References 


Bekessy, A. “Remarks on beta distributed random numbers”, Pub. of the Math. Inst., Hungarian
Academy of Science, Series A, Volume 9, pp. 565-571.


Johnk, M.D. “Erzeugung von betaverteilten und gammaverteilten Zufallszahlen”, Metrika, Volume
8, pp. 5-15.


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Volume 4, No. 3, April 1973,
pp. 7-16.
Source
R.E. Wheeler
E.I. duPont de Nemours & Co. Inc.
Wilmington, Delaware 
PGEOMETRIC
Syntax
R←PGEOMETRIC X
Description 
PGEOMETRIC generates random number(s) from the geometric distribution. 
Argument 


X - The parameter X=(1-P)÷P, where P is the probability of success at each trial.
X may be any shape array. 


Result 
R - The random number(s). R will have the same shape as X.
Example 
R←PGEOMETRIC 2 4 5
R 
14 0 5 
Notes and Hints 
1) This gives the waiting time for the first failure when P is the probability of success at each trial. 
2) The system variable ⎕RL should be used to set the seed for random number generation. The
default value of ⎕RL is 16807, and will be changed each time ?, the primitive random number
generator, is used.
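
The relation X=(1-P)÷P between the argument and the per-trial probability can be sketched in Python as follows. The sketch assumes the generated count has mean X (a geometric count with per-trial probability P=1÷(1+X)) and draws it by inversion; both the interpretation and the helper names are assumptions, not the workspace's code.

```python
import math
import random

def x_from_p(p):
    """Argument X for a given per-trial probability P: X = (1-P)/P."""
    return (1.0 - p) / p

def p_from_x(x):
    """Inverse of x_from_p: P = 1/(1+X)."""
    return 1.0 / (1.0 + x)

def pgeometric(x, rng=random):
    """Geometric count with mean x (x > 0), drawn by inverting P(K >= k) = (1-p)**k."""
    p = p_from_x(x)
    u = 1.0 - rng.random()               # in (0, 1], keeps the log finite
    return int(math.log(u) / math.log(1.0 - p))

rng = random.Random(16807)
draws = [pgeometric(2.0, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
```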
References 
Greenwood, M. and G.U. Yule. “An Inquiry into the Nature of Frequency Distributions of Multiple
Happenings, With Particular Reference to the Occurrence of Multiple Attacks of Disease or Repeated
Accidents”, Journal of the Royal Statistical Society, Series A, Vol. 83, 1920, pp. 255-279.


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source 
R.E. Wheeler 


E.I. duPont de Nemours & Co. Inc. 
Wilmington, Delaware 
PHYPER
Syntax
R←K PHYPER M GEOMETRIC S
Description 


PHYPER and GEOMETRIC together generate random number(s) from the hypergeometric distribution. 


Arguments

K - Sample size.
M - Number of special items.
S - Population size.


K, M, and S must be arrays of the same shape. Alternately, any or all may be scalars. 

Result

R - The random number(s). The shape of R will be scalar if all of K, M, and S are scalar.
Otherwise it will have the shape of the non-scalar argument(s), with each number in the result
corresponding to the appropriate values in that position of the argument(s).


Examples 


15 PHYPER 20 GEOMETRIC 50 100 500 


6 4 1 
15 20 50 PHYPER 20 GEOMETRIC 75 
2 6 10 
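
The quantity being simulated is the number of special items found in a sample drawn without replacement. A direct Python sketch of that definition follows; it is an illustration under that reading of K, M, and S, not the algorithm used by PHYPER and GEOMETRIC, and the function name `phyper` is hypothetical.

```python
import random

def phyper(k, m, s, rng=random):
    """Special items in a sample of k drawn without replacement from s items,
    of which m are special."""
    if not (0 <= k <= s and 0 <= m <= s):
        raise ValueError("need 0 <= k <= s and 0 <= m <= s")
    drawn = rng.sample(range(s), k)          # indices of the sampled items
    return sum(1 for i in drawn if i < m)    # items 0..m-1 are labelled special

rng = random.Random(16807)
counts = [phyper(15, 20, s, rng) for s in (50, 100, 500)]
```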


Notes and Hints 


The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator,
is used.


Reference 


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source
R.E. Wheeler
E.I. duPont de Nemours & Co. Inc.
Wilmington, Delaware 


PLOGNORMAL 
Syntax
R←M PLOGNORMAL S
Description 


PLOGNORMAL generates random number(s) from the lognormal distribution. 


Arguments 
M - The mean of the associated normal distribution. 
S  - The standard deviation of the associated normal distribution. 


M and S must be arrays of the same shape. Alternately one or both may be scalars. 

Result 

R - The random number(s). The shape of R will be scalar if M and S are both scalar. Otherwise
it will have the shape of the non-scalar argument(s), with each number in the result
corresponding to the appropriate values in that position of the argument(s).


Examples 


R←0 PLOGNORMAL 1
R 
0.534147829 


R←0 PLOGNORMAL .5 1 2
0 sab ЫЫЫ? 1.475243883 0.8836634425 

R←0 1 PLOGNORMAL 1 .5
0 „ера 1.528361283 
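
Since M and S describe the associated normal distribution, a lognormal deviate is the exponential of a normal deviate with mean M and standard deviation S. The Python sketch below shows that relationship; the function name `plognormal` is hypothetical and the code is not taken from the workspace.

```python
import math
import random

def plognormal(m, s, rng=random):
    """Lognormal deviate: exp of a normal deviate with mean m and std. dev. s."""
    return math.exp(m + s * rng.gauss(0.0, 1.0))

rng = random.Random(16807)
draws = [plognormal(0.0, 1.0, rng) for _ in range(20000)]
# The logs of the draws should average close to m.
mean_log = sum(math.log(v) for v in draws) / len(draws)
```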
Notes and Hints 
The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator,
is used.


Reference 


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source
R.E. Wheeler
E.I. duPont de Nemours & Co. Inc.
Wilmington, Delaware 
PNEGBIN
Syntax
R←M PNEGBIN X
Description 
PNEGBIN generates random number(s) from the negative binomial distribution. 
Arguments
M and X - Parameters according to the expansion
(S-X)*-M where S=1+X
M and X can be any arrays of matching shape. Alternately one or both can be scalars.
Result
R - The random number(s). The shape of R will be scalar if M and X are both scalar. Otherwise
it will have the shape of the non-scalar argument(s), with each number in the result
corresponding to the appropriate values in that position of the argument(s).
Examples 


R←10 PNEGBIN .3 .5 .7


R 
г 6 iT 
R←10 20 PNEGBIN .25
R 
4 5 
R←10 20 PNEGBIN .25 .4
R 
0 7 


Notes and Hints 


1) One interpretation of PNEGBIN is the number of trials until M failures, when P is the probability
of success at each trial and X=(1-P)÷P.


2) The system variable ⎕RL should be used to set the seed for random number generation. The
default value of ⎕RL is 16807, and will be changed each time ?, the primitive random number
generator, is used.


Methodology 


The function is obtained as the mixture of Poisson random variables with parameters following a 
gamma distribution. 
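
The mixture described above can be sketched in Python: draw a Poisson mean from a gamma distribution, then draw a Poisson count with that mean. The sketch assumes a gamma shape of M and scale of X, which gives a negative binomial count with mean M×X; that parameterization, the simple multiplication method for the Poisson draw, and all names are assumptions for illustration.

```python
import math
import random

def poisson(lam, rng=random):
    """Poisson deviate by multiplying uniforms until the product drops below exp(-lam)."""
    limit = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def pnegbin(m, x, rng=random):
    """Negative binomial deviate as a gamma mixture of Poisson deviates."""
    lam = rng.gammavariate(m, x)         # assumed: shape m, scale x
    return poisson(lam, rng)

rng = random.Random(16807)
draws = [pnegbin(10, 0.3, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)    # should be near 10 * 0.3 = 3
```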
References

Greenwood, M. and G.U. Yule. “An Inquiry into the Nature of Frequency Distributions of Multiple
Happenings, With Particular Reference to the Occurrence of Multiple Attacks of Disease or Repeated
Accidents”, Journal of the Royal Statistical Society, Series A, Vol. 83, 1920, pp. 255-279.


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source 
R.E. Wheeler 


E.I. duPont de Nemours & Co. Inc. 
Wilmington, Delaware 
PNON

Syntax
R←A PNON N1 CENTRALF N2
Description
PNON and CENTRALF together generate random number(s) from the non-central F distribution.
Arguments
A - The noncentrality parameter of the form:

A=(+/MU*2)*.5


N1 and N2 - Degrees of freedom.
A, N1, and N2 must be arrays of the same shape. Alternately any or all may be scalars.


Result


R - The random number(s). The shape of R will be scalar if all of A, N1, and N2 are scalar.
Otherwise it will have the shape of the non-scalar argument(s), with each number in the result
corresponding to the appropriate values in that position of the argument(s).


Examples 


R←4 6 10 PNON 18 CENTRALF 20
R
2.522451871 2.489024266 4.892260219
R←4 PNON 15 CENTRALF 12
R
4.25360433
R←4 7 PNON 15 11 CENTRALF 12 15
R
2.490790375 3.007511168


Notes and Hints 

The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator,
is used.


Reference 


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source 
R.E. Wheeler 


E.I. duPont de Nemours & Co. Inc.
Wilmington, Delaware 
PNONCENTCHISQ
Syntax
R←A PNONCENTCHISQ M
Description 


PNONCENTCHISQ generates random number(s) from the non-central chi-square distribution. 


Arguments 
M - Degrees of freedom. 
A - The noncentrality parameter of the form:

A=(+/MU*2)*.5

M and A can be any arrays of matching shape. Alternately one or both can be scalars.
Result


R - The random variable(s). The shape of R will be scalar if A and M are both scalar. Otherwise
it will have the shape of the non-scalar argument(s), with each number in the result
corresponding to the appropriate values in that position of the argument(s).


Examples 


R←12 PNONCENTCHISQ 25
R
138.091278
R←4 PNONCENTCHISQ 20 14
R
52.69400352 30.35550693
R←4 12 PNONCENTCHISQ 12 13
R
39.30444225 143.6920474
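
For integer degrees of freedom, a non-central chi-square deviate is the sum of squares of M unit-variance normal deviates whose means MU satisfy A=(+/MU*2)*.5. The Python sketch below places the whole offset A on one component, which gives the same distribution; it is an illustration restricted to integer M, not the workspace's method, and the names are hypothetical.

```python
import math
import random

def noncentrality(mu):
    """A = square root of the sum of squared means, as in A=(+/MU*2)*.5."""
    return math.sqrt(sum(v * v for v in mu))

def pnoncentchisq(a, m, rng=random):
    """Non-central chi-square deviate for integer m >= 1 and noncentrality a."""
    total = (rng.gauss(0.0, 1.0) + a) ** 2       # the one shifted component
    for _ in range(m - 1):
        total += rng.gauss(0.0, 1.0) ** 2        # remaining central components
    return total

rng = random.Random(16807)
draws = [pnoncentchisq(12.0, 25, rng) for _ in range(5000)]
sample_mean = sum(draws) / len(draws)            # theoretical mean is m + a*a = 169
```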


Notes and Hints 

The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator,
is used.


Reference 


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source
R.E. Wheeler
E.I. duPont de Nemours & Co. Inc.
Wilmington, Delaware 
PNONCENTRALT
Syntax
R←A PNONCENTRALT M
Description 
PNONCENTRALT generates random number(s) from the non-central t distribution. 
Arguments 


A - The constant added to the normal variable. 
M - The degrees of freedom.


M and A can be any arrays of matching shape. Alternately one or both can be scalars.
Result 


R - The random number(s). The shape of R will be scalar if A and M are both scalar. Otherwise 
it will have the shape of the non-scalar argument(s), with each number in the result corre- 
sponding to the appropriate values in that position of the argument(s). 


Examples 


R←3.5 PNONCENTRALT 10 20
R
1.661925637 3.685517827
R←3.5 6.5 PNONCENTRALT 10
R
5.714995251 7.153431082
R←3.5 6.5 PNONCENTRALT 10 20
R
2.762488999 7.037532844
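
With A described as the constant added to the normal variable, a non-central t deviate can be formed as a shifted normal deviate divided by the square root of an independent chi-square deviate over its degrees of freedom. The Python sketch below follows that definition; it is an illustration and the function name `pnoncentralt` is hypothetical.

```python
import math
import random

def pnoncentralt(a, m, rng=random):
    """Non-central t deviate: (Z + a) / sqrt(chi-square(m) / m)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(m / 2.0, 2.0)    # chi-square with m degrees of freedom
    return (z + a) / math.sqrt(chi2 / m)

rng = random.Random(16807)
draws = [pnoncentralt(3.5, 20, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
```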


Notes and Hints 

The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator,
is used.


Reference 


Wheeler, R.E. “Random Variable Generators", APL Quote Quad, Vol. 4, No. 3, April 1973, pp. 
7-16. 


Source 
R.E. Wheeler 


E.I. duPont de Nemours & Co. Inc. 
Wilmington, Delaware 
PPOISSON

Syntax
R←PPOISSON M
Description
PPOISSON generates random number(s) from the Poisson distribution.
Argument 
M - Specifies the mean. M may be any shape array.
Result
R - The random number(s). The shape of R will match that of M.
Example 

R←PPOISSON 5 5

R 
о 1 
Notes and Hints
The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator,
is used.

Reference 


Wheeler, R.E. “Random Variable Generators”, APL Quote Quad, Vol. 4, No. 3, April 1973, pp.
7-16. 


Source
R.E. Wheeler
E.I. duPont de Nemours & Co. Inc.
Wilmington, Delaware 
PT
Syntax
R←PT M
Description 


PT generates random number(s) from the t distribution. 


Argument 
M - Specifies the degrees of freedom. M can be any shape array. 
Result 
R - The random number(s). (Same shape as the argument M.) 
Example 

R←PT 3 14
R
¯1.471104919 0.203556957

Notes and Hints

The system variable ⎕RL should be used to set the seed for random number generation. The default
value of ⎕RL is 16807, and will be changed each time ?, the primitive random number generator,
is used.


Reference 


Wheeler, R.E. “Random Variable Generators", APL Quote Quad, Vol. 4, No. 3, April 1973, pp. 
7-16. 


Source
R.E. Wheeler
E.I. duPont de Nemours & Co. Inc.
Wilmington, Delaware 
RANDOMIZE
Syntax 
RANDOMIZE 
Description 


The effect of invoking the function RANDOMIZE is to establish a new random link for the APL random 
number generator. 


Methodology 


The APL random number generator produces a set of random numbers by calculating a sequence
of numbers from a particular starting point, called the link. This link is a workspace characteristic,
so every time 33 RANDOM is loaded, the same seed will be in effect. RANDOMIZE calculates a new
seed from variables that should be reasonably randomly distributed, according to the following
formula:


DRL-(2x31)| (1.11659 3 


6 БЕЧ 1Е3 1)*.x(50 | LUZ) „СТ: 


657] 


Notes and Hints 


⎕RL is an APL system variable, the random link, whose value is the current random link in the workspace.
To achieve reproducibility of results, store the value of ⎕RL after executing RANDOMIZE and set ⎕RL
to this value before subsequent uses.
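
The save-and-restore pattern can be illustrated with a toy generator in Python. The sketch assumes a multiplicative congruential generator with multiplier 16807 and modulus 2*31 minus 1, which the default link of 16807 suggests but the source does not state; the class `RandomLink` and its `roll` method are hypothetical stand-ins for ⎕RL and the primitive ?.

```python
MODULUS = 2**31 - 1      # assumed modulus (not stated in the source)
MULTIPLIER = 16807       # assumed multiplier, equal to the documented default link

class RandomLink:
    """Toy stand-in for the random link and the primitive random number generator."""

    def __init__(self, seed=16807):
        self.rl = seed                   # plays the role of the random link

    def roll(self, n):
        """Advance the link and return an integer from 1 to n."""
        self.rl = (MULTIPLIER * self.rl) % MODULUS
        return 1 + (self.rl * n) // MODULUS

gen = RandomLink()
saved = gen.rl                           # store the link before drawing
first = [gen.roll(100) for _ in range(5)]
gen.rl = saved                           # restore it to reproduce the results
second = [gen.roll(100) for _ in range(5)]
```

Restoring the stored link makes `second` repeat `first`, which is the reproducibility the note describes.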


Source 


Lib Gibson 
I.P. Sharp Associates Limited
Calgary, Alberta 


LIBRARY 34 


ANALYSIS OF VARIANCE 


Workspaces: ANOVA 


COVARIANCE 


34 ANOVA 


FUNCTIONS

Function
Header                    Documentation     Description

ANOVA1                    ANOVA1HOW         Crossed or nested design analysis of
                                            variance.

ANOVA2 D                  ANOVA2HOW         One-way analysis of variance with
                                            unequal subclass numbers.

LATINSQC                  LATINSQCHOW       Conversational function for analysis of
                                            variance with a Latin square design.

DESIGN LATINSQP DATA      LATINSQPHOW       Non-conversational function for analysis
                                            of variance with a Latin square design.

MEANSP DATA               MEANSPHOW         Profile analysis and/or treatment
                                            combination means tables for factorial
                                            experiments.

NWAYANOVAC                NWAYANOVACHOW     Conversational function for analysis of
                                            variance with completely random or
                                            randomized complete block designs.

NWAYANOVAP DATA           NWAYANOVAPHOW     Non-conversational function for analysis
                                            of variance with completely random
                                            or randomized complete block designs.

SPLITPLOTC DATA           SPLITPLOTCHOW     Conversational function for analysis of
                                            variance with split or split-split plot
                                            designs.


ANOVA1
Syntax 
ANOVA1 
Description 
This function analyzes a factorial design with no missing data as a crossed, nested, or cross-nested
design. Replications are considered as a factor; there may be any number ≥2 of levels of any number
>2 of factors.
Input 
If the data has not been stored as a multidimensional array in the global variable X, then enter a 
vector giving the number of levels of each factor (including replications) as the parameters. Then enter 
observations, one at a time, in the format: level of 1st factor, level of 2nd factor,..., level of last factor, 
observation. After the last observation has been entered, enter 0. 
After the grand mean and total degrees of freedom and sum of squares have been typed out, enter 
effects, one at a time, with 1-factor effects first, 2-factor effects second, and so on. To designate the 
first effect, simply type 1, for the second, type 2; for the interaction of the first and second effects, 
type 1 2. 
Output 
Effect 1, followed by effect 2, followed by effect 1 2 gives the main effects for factors 1 and 2 and
their interaction. Effect 1 followed by effect 1 2 will give main effect for factor 1, then 2nd factor
nested within 1st factor. An effect of 0 produces any residual term.
Example


Taken from Cochran and Cox, page 226. See References.


42 46 47 39 53 42 
47 29 35 47 57 45 
3 15 6
ANOVA1 
ENTER PROBLEM NUMBER.
⎕:
1
DO YOU WISH TO ENTER PARAMETERS AND DATA?
NO
PROBLEM NUMBER 1 DATE 7 10 78
GRAND MEAN 32.1222222
TOTAL DF AND SS 269 18142.9667
⎕:
2
EFFECT DF,SS AND MS 14 10204.2444 728.874602
AND MS 2 135.088889 57.5444444 


EFFECT DF,SS ¢ 


2: 


28 1198.46 


42.802361 


EF. 


210 L298. 56859 


Notes and Hints 


1) The residuals are stored in the global variable XX. 
2) The following is an illustration of how to structure the global variable X:


44404 ме Е 10 


а 70% acy d$ 


where the columns within each square refer to the first factor A and the rows to the second factor
B. The data should be prepared as a vector: 1, 2, 3...12, and then restructured into X with
dimensions (3 2). If it is desired to treat the design as a 3x2x2 factorial with a single
replication, then X must be restructured to have dimensions (1, 3, 2, 2).


References 
Cochran, W.G. and G.M. Cox. Experimental Designs, John Wiley & Sons Inc., London, 1950.


Smillie, K.W. Statpack 2: An APL Statistical Package. Publication No. 17. Dept. of Computing 
Science, University of Alberta, 1969. 
Source


Dr. K.W. Smillie 

Department of Computing Science 
University of Alberta 

Edmonton, Alberta 
ANOVA2
Syntax 
ANOVA2 D 
Description 


ANOVA2 performs an analysis of variance on a one-way classification with unequal numbers of
observations on each treatment.


Argument 

D - An M×N matrix, where M is the number of observations for the treatment with the maximum
number of observations, and N is the number of treatments. Each column of D contains the 
positively-valued observations of a treatment, with zeros filling out any extra cells. 

Output 


The following tables are printed: 


1) An analysis of variance table providing source of variation, degrees of freedom, sums of squares, 
mean squares, F-ratio and probability level. 


2) Sample size and mean for each treatment. 


3) The results of Scheffe's means test on all treatment combinations. Differences between means and 
the F-ratios and probability levels are printed. 


Example 


Taken from Dixon and Massey, page 149. See References.


D
7 6 8 7
2 4 4 4
4 6 5 3

ANOVA2 D

ANALYSIS OF VARIANCE

SOURCE        DF        SS        MS         F     PLEVEL
TREATMENTS     3    3.3333    1.1111    0.2721   0.843929
ERROR          8   32.6667    4.0833
TOTAL         11   36.0000

NO.   SIZE      MEAN
1        3    4.3333
2        3    5.3333
3        3    5.6667
4        3    4.6667

SCHEFFE'S MEANS TEST

COMB       DIFF         F     PLEVEL
1 2     ¯1.0000    0.1224   0.944255
1 3     ¯1.3333    0.2177   0.881543
1 4     ¯0.3333    0.0136   0.997648
2 3     ¯0.3333    0.0136   0.997648
2 4      0.6667    0.0544   0.982097
3 4      1.0000    0.1224   0.944255

Notes and Hints 


This function cannot be used if there are actual values of zero in the data as they will be interpreted 
as missing values. 
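
The arithmetic behind the printed tables can be sketched in Python. This is an illustration of a standard one-way analysis with zero-padded columns and Scheffe F-ratios, not the workspace's APL code; the matrix `D` below is chosen to agree with the treatment means and sums of squares printed in the example, and the function name `anova2` is hypothetical. Probability levels are omitted because they need an F distribution routine.

```python
D = [[7, 6, 8, 7],
     [2, 4, 4, 4],
     [4, 6, 5, 3]]

def anova2(d):
    """One-way analysis of variance; columns are treatments, zeros mark empty cells."""
    groups = [[row[j] for row in d if row[j] != 0] for j in range(len(d[0]))]
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_treat = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_error = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_treat = len(groups) - 1
    df_error = n_total - len(groups)
    ms_error = ss_error / df_error
    f_ratio = (ss_treat / df_treat) / ms_error
    scheffe = {}
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            diff = means[i] - means[j]
            scale = df_treat * ms_error * (1 / len(groups[i]) + 1 / len(groups[j]))
            scheffe[(i + 1, j + 1)] = (diff, diff ** 2 / scale)
    return {"means": means, "ss_treat": ss_treat, "ss_error": ss_error,
            "f": f_ratio, "scheffe": scheffe}

result = anova2(D)
```

Because zeros are dropped before the sums are formed, a genuine observation of zero would be lost, which is the restriction described in Notes and Hints.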


References 
Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, Toronto, 1957. 


Smillie, K.W. Statpack 2: An APL Statistical Package, Publication No. 17, Dept. of Computing 
Science, University of Alberta, 1969. 


Source 
Dr. K.W. Smillie 
Department of Computing Science, 


University of Alberta 
Edmonton, Alberta 
LATINSQC
Syntax 
LATINSQC 
Description 


LATINSQC is an interactive function which performs analysis of variance on a 3-factor experiment 
with a Latin square design. 


Input 

The user is asked conversationally to enter 

1) The number of levels of replications. 

2) The number of levels of the 3 factors. This number must be greater than 2 and less than 53. 


3) Each replication of the experimental data, entered as 
ROW1,ROW2,ROW3,...,ROW M 


4) Each row of the Latin square design as permutations of the first m letters of the alphabet (and 
the letters underscored, if necessary), where m is the number of levels of the factors. 


Output 


An analysis of variance table is produced giving the source, sums of squares, degrees of freedom, mean 
squares, F-ratios, and the probability levels of the F-ratios (probability of obtaining a higher F-value). 


Example 


Taken from Steel and Torrie, pages 148-150. See References. 


LATINSQC 
ENTER THE NUMBER OF REPLICATIONS.
⎕:
1
ENTER THE NUMBER OF ROWS.
⎕:
4
YOU HAVE 4 ROWS, 4 COLUMNS, AND 4 TREATMENTS WITH 1 REPLICATION(S).
NOW ENTER EACH REPLICATION OF THE DATA, IN THE FORM ROW 1, ROW 2, ..., ROW 4
ENTER REPLICATION 1
⎕:
10.5 7.7 12 13.2 11.1 12 10.3 7.5 5.8 12.2 11.2 13.7
11.6 12.3 5.9 10.2
ENTER THE ROWS OF THE LATIN SQUARE DESIGN AS PERMUTATIONS OF A B C D
SEPARATE THE ROWS FROM EACH OTHER BY BLANKS, IF DESIRED.
CDBABACDDCABABDC

ALIGN PAPER 


ANALYSIS OF VARIANCE -- LATIN SQUARE DESIGN


SOURCE    DF       SS       MS  F-RATIO  PLEVEL
ROWS       3   1.9550   0.6517   1.4375  0.3220
COLUMNS    3   6.8000   2.2667   5.0000  0.0457
TRTMNTS    3  78.9250  26.3083  58.0331  0.0002
ERROR      6   2.7200   0.4533

TOTAL     15  90.4000


Notes and Hints 

1) If your Latin square design and data are already stored in the computer and are of the correct
APL data forms (MxM character matrix and REPxMxM numeric array, respectively, for a 3-factor
experiment with M levels for each factor and REP replications), use

DESIGN LATINSQP DATA

to get the same output (cf. LATINSQP in this workspace).

2) In entering the replications of your data matrix, if the entries require more than 1 line, use

ENT1,ENT2,...,ENTN,⎕


and then continue after hitting carriage return.


3) For an experiment with m levels for each factor, the first m characters of 


ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ are used for the Latin 
square design. 


References 

Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, Toronto, 1957. 
Keeping, E.S. Introduction to Statistical Inference, D. Van Nostrand, Toronto, 1962. 

Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, Toronto, 1960. 
Source 

R. Hui 


I.P. Sharp Associates Limited
Calgary, Alberta 
LATINSQP

Syntax 

DESIGN LATINSQP DATA 

Description 

LATINSQP performs analysis of variance on a 3-factor experiment with a Latin square design. 

Arguments 

(For an experiment with M levels of each factor and REP replications.)

DESIGN - The Latin square design as an MxM character matrix. Each row (and column) of the design
must be a permutation of the first M letters of the alphabet and the letters underscored,
if necessary.

DATA - The experimental data as a REPxMxM numeric array.

Output 


An analysis of variance table is produced giving the source, sums of squares, degrees of freedom, mean 
squares, F-ratios, and the probability levels of the F-ratios (probability of obtaining a higher F-value). 


Notes and Hints 

1) If your Latin square design or data, or both, are not stored in the computer or are not of the
APL data forms required by LATINSQP, use LATINSQC. LATINSQC produces the same output
as LATINSQP but requests the Latin square design and data conversationally (cf. LATINSQC in
this workspace).


2) The number of levels of each factor must be greater than 2 and less than 53.


Example 


Taken from Steel and Torrie, pages 148-150. See References. 


DATA
10.5 7.7 12 13.2
11.1 12 10.3 7.5
5.8 12.2 11.2 13.7
11.6 12.3 5.9 10.2
DESIGN
CDBA
BACD
DCAB
ABDC
DESIGN LATINSQP DATA
ALIGN PAPER
ANALYSIS OF VARIANCE -- LATIN SQUARE DESIGN


SOURCE    DF       SS       MS  F-RATIO  PLEVEL
ROWS       3   1.9550   0.6517   1.4375  0.3220
COLUMNS    3   6.8000   2.2667   5.0000  0.0457
TRTMNTS    3  78.9250  26.3083  58.0331  0.0002
ERROR      6   2.7200   0.4533

TOTAL     15  90.4000
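
For readers who want to check the table by hand, the Python sketch below recomputes the sums of squares and F-ratios for this single-replication example with the textbook Latin square formulas. It is an illustration under that assumption, not the library's APL implementation, and it does not compute the probability levels.

```python
# Sketch only: Latin square sums of squares for one replication.
data = [
    [10.5, 7.7, 12.0, 13.2],
    [11.1, 12.0, 10.3, 7.5],
    [5.8, 12.2, 11.2, 13.7],
    [11.6, 12.3, 5.9, 10.2],
]
design = ["CDBA", "BACD", "DCAB", "ABDC"]

m = len(data)
total = sum(sum(row) for row in data)
correction = total ** 2 / (m * m)            # correction term for the mean


def ss_from_totals(totals):
    """Sum of squares from m totals of m observations each."""
    return sum(t ** 2 for t in totals) / m - correction


row_totals = [sum(row) for row in data]
col_totals = [sum(col) for col in zip(*data)]
trt_totals = {}
for r in range(m):
    for c in range(m):
        letter = design[r][c]
        trt_totals[letter] = trt_totals.get(letter, 0.0) + data[r][c]

ss_rows = ss_from_totals(row_totals)
ss_cols = ss_from_totals(col_totals)
ss_trt = ss_from_totals(trt_totals.values())
ss_total = sum(x ** 2 for row in data for x in row) - correction
ss_error = ss_total - ss_rows - ss_cols - ss_trt

ms_error = ss_error / ((m - 1) * (m - 2))
for name, ss in [("ROWS", ss_rows), ("COLUMNS", ss_cols), ("TRTMNTS", ss_trt)]:
    ms = ss / (m - 1)
    print(name, m - 1, "%.4f" % ss, "%.4f" % ms, "%.4f" % (ms / ms_error))
print("ERROR", (m - 1) * (m - 2), "%.4f" % ss_error, "%.4f" % ms_error)
print("TOTAL", m * m - 1, "%.4f" % ss_total)
```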

References 


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, Toronto, 1957. 
Keeping, E.S. Introduction to Statistical Inference, D. Van Nostrand, Toronto, 1962. 

Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, Toronto, 1960. 
Source 

R. Hui 


I.P. Sharp Associates Limited
Calgary, Alberta 
MEANSP
Syntax 
MEANSP DATA 
Description 


MEANSP is a non-conversational function which produces means tables and/or profiles of factor
interactions for an N factor experiment.


Argument 


DATA - An array of the observations. The rank (⍴⍴DATA) of DATA is the number of factors in the
experiment, while the dimensions (⍴DATA) of DATA are the numbers of levels of the factors.


Input 


The profiles or tables to be produced by MEANSP must be specified before executing
MEANSP by typing


PROFILES 'XXX'
or
TABLES 'XXX'


XXX is a character vector designating factor combinations for which tables and/or profiles are to be
produced, separated by commas. For an N factor experiment, the designations must be of one of two
forms:


1) ALL J, where J is an integer between 1 and N
2) Any size J combinations of the first N letters of the alphabet.


In the first form, ALL J indicates all J factor interactions; in the second form, each combination
indicates a specific factor interaction. An example of valid profile specifications for a 4 factor
experiment is PROFILES 'AB,DCB,ALL 4'. If you do not wish to have any profiles or tables produced,
type PROFILES 'OFF' or TABLES 'OFF'.
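
To make the specification syntax concrete, the following Python sketch expands a specification string into the factor combinations it names. The function expand_spec is hypothetical; how the library itself parses the string (for example, whether it reorders the letters of DCB) is not documented here, so the alphabetical ordering in the result is an assumption of the sketch.

```python
from itertools import combinations
import string


def expand_spec(spec, n_factors):
    """Expand e.g. 'AB,DCB,ALL 4' into a list of factor-letter combinations."""
    names = string.ascii_uppercase[:n_factors]
    result = []
    for segment in spec.split(","):
        segment = segment.strip()
        if segment.upper().startswith("ALL"):
            j = int(segment[3:])                     # the J in ALL J
            if not 1 <= j <= n_factors:
                raise ValueError("ALL J needs 1 <= J <= %d" % n_factors)
            result.extend("".join(c) for c in combinations(names, j))
        else:
            valid = (segment
                     and all(ch in names for ch in segment)
                     and len(set(segment)) == len(segment))
            if not valid:
                raise ValueError("bad factor combination: %r" % segment)
            result.append("".join(sorted(segment)))  # assumption: order is irrelevant
    return result


print(expand_spec("AB,DCB,ALL 4", 4))
```

For a 3 factor experiment, expand_spec("ALL 2", 3) gives the three two-factor interactions AB, AC, and BC.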

Output 


Tables and/or profiles appropriate to the specifications entered will be printed. 




Example 


Taken from Steel and Torrie, pages 199-203. See References. 


DATA

8.53 17.53
39.14 32

20.53 21.07
26.2 23.8

12.53 20.8
31.33 28.87

14 17.33
45.8 25.06

10.8 20.07
40.2 29.33


PROFILES 'BC'
WAS 'OFF'

TABLES 'BC'
WAS 'OFF'


MEANSP DATA 


(Profile plot of the B by C means: vertical axis from 13.28 to 53.28, horizontal axis LEVELS OF C
with levels 1 and 2. LEGEND: o B1, * B2.)


          C1       C2    MEANS
B1   13.2780  19.3600  16.3190
B2   36.5340  27.8120  32.1730
MEANS 24.9060  23.5860  24.2460
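
The BC means table is obtained by averaging the data over the remaining factor A. The Python sketch below, which is an illustration and not the library's code, does this for the example array, taking it as 5 levels of A, 2 of B, and 2 of C.

```python
# Sketch only: B-by-C table of means, averaging over factor A.
# DATA[a][b][c] holds 5 levels of A, 2 of B, 2 of C.
DATA = [
    [[8.53, 17.53], [39.14, 32.0]],
    [[20.53, 21.07], [26.2, 23.8]],
    [[12.53, 20.8], [31.33, 28.87]],
    [[14.0, 17.33], [45.8, 25.06]],
    [[10.8, 20.07], [40.2, 29.33]],
]
n_a, n_b, n_c = len(DATA), len(DATA[0]), len(DATA[0][0])

bc = [[sum(DATA[a][b][c] for a in range(n_a)) / n_a for c in range(n_c)]
      for b in range(n_b)]
b_means = [sum(row) / n_c for row in bc]                       # row margins
c_means = [sum(bc[b][c] for b in range(n_b)) / n_b for c in range(n_c)]
grand = sum(b_means) / n_b

for b in range(n_b):
    print("B%d" % (b + 1), *["%.4f" % v for v in bc[b]], "%.4f" % b_means[b])
print("MEANS", *["%.4f" % v for v in c_means], "%.4f" % grand)
```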


Notes and Hints 


1) MEANSP does not perform any analysis of variance. To get analysis of variance as well as means 
tables and/or profiles, use the function NWAYANOVAC or NWAYANOVAP in this workspace. 


2) For an N factor experiment, a request for an N factor means table results in output of DATA
in a possibly different arrangement; a request for an N factor profile results in plots of the
appropriate coordinate planes of DATA.


3) The function INPUTDATA in this workspace may be helpful in setting up the data array DATA
for MEANSP. For a description of this function, see the first entry under "Input" in the
documentation for NWAYANOVAC. INPUTDATA is invoked by typing INPUTDATA, and constructs the result
as the global variable DATA.
References


Arner, S.L. Profile: A Computer Program for Displaying the Geometric Relationships of Responses
in a Factorial Design, U.S.D.A. Forest Service Paper NE-288, 1974.


Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, Toronto, 1960.
Source 
R. Hui 


I.P. Sharp Associates Limited
Calgary, Alberta 
NWAYANOVAC


Syntax


NWAYANOVAC


Description


NWAYANOVAC is a conversational function which performs analysis of variance on an N factor
experiment with a blocked or completely random design, with or without missing values, with or without
replications, and with a standard or non-standard error term. In addition, profiles and/or tables of
treatment combination means may be produced if desired.


Input 


1) The experimental data, which may be entered in one of two ways:


a) If your data is already stored in the computer in the form of an array whose rank (⍴⍴) is
the number of factors and whose dimensions (⍴) are the numbers of levels of the factors, then
simply enter the name of the array.


b) If, however, the above conditions are not satisfied, the subfunction INPUTDATA will be invoked
by NWAYANOVAC to help you enter the data. INPUTDATA will first request you to enter the number
of levels of each factor. Subsequently the experimental data are input by repeated entry of the
levels of the last factor at combinations of all levels of the remaining factors. For example, for
an experiment with 3 factors with 2, 2, and 3 levels respectively, the entries will be:


Factor 1    Factor 2    Factor 3
Level 1     Level 1     All Levels
Level 1     Level 2     All Levels
Level 2     Level 1     All Levels
Level 2     Level 2     All Levels


The result of the function INPUTDATA is the global variable DATA, which remains available after
execution of NWAYANOVAC.


2) Whether you have any missing values. If your response is YES, you will be asked to enter the
value in the data which represents missing values. (Since arrays may not be "ragged" in APL,
a dummy value must be used as a "place holder" for missing data in the design; clearly this must
be some arbitrary value which never actually occurs in the raw data.)


3) Whether the experiment is a blocked design or not. If your response is YES, you will be asked
to enter the name of the factor to be treated as blocks (e.g., enter 'A', 'B', 'C', etc.).


4) Whether you have any replications that you do not wish to treat as a separate factor. If your
response is YES, you will be asked to enter the name of the factor to be treated as replications
(e.g., enter 'A', 'B', 'C', etc.).


5) Whether you wish to have special effects included in the error term. (The standard error term
consists of the highest order factor interaction.) If your response is YES, you will be asked to
enter specifications for the error term. For the syntax of the specifications, see Notes and Hints
1) and 2).
6) Whether you wish to have profiles of the treatment combination means. If your response is
YES, you will be asked to enter the specifications for the profiles. For the syntax of the
specifications, see Notes and Hints 1) and 2).


7) Whether you wish to have tables of the treatment combination means. If your response is YES,
you will be asked to enter the specifications for the tables. For the syntax of the specifications,
see Notes and Hints 1) and 2).
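
The order of entry described for INPUTDATA in item 1 b) amounts to filling the array with the last factor varying fastest. The Python sketch below illustrates that ordering; entry_order and build_array are hypothetical helpers written for this illustration, not functions of the workspace.

```python
from itertools import product


def entry_order(levels):
    """Level combinations of all but the last factor, in the order of entry."""
    leading = [range(1, n + 1) for n in levels[:-1]]
    return list(product(*leading))


def build_array(levels, entries):
    """Nest the entered vectors into an array with the given numbers of levels."""
    expected = 1
    for n in levels[:-1]:
        expected *= n
    if len(entries) != expected:
        raise ValueError("expected %d entries, got %d" % (expected, len(entries)))
    if any(len(e) != levels[-1] for e in entries):
        raise ValueError("each entry must hold all levels of the last factor")

    def nest(rows, dims):
        if not dims:
            return rows[0]
        size = len(rows) // dims[0]
        return [nest(rows[i * size:(i + 1) * size], dims[1:]) for i in range(dims[0])]

    return nest(entries, list(levels[:-1]))


levels = (2, 2, 3)
print(entry_order(levels))
entries = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]   # made-up values
DATA = build_array(levels, entries)
print(DATA[1][0])     # the entry made for Factor 1 Level 2, Factor 2 Level 1
```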

Output 

An analysis of variance table is produced giving the source of variation, degrees of freedom, sums of
squares, mean squares, F-ratios, and the probability levels of the F-ratios (probability of obtaining
a higher F-value). Profiles and/or tables are also printed at this time if requested previously.
Example


Taken from Steel and Torrie, pages 236-239. See References. 


DATA

42.9 53.8 49.5 44.4
53.3 57.6 59.8 64.1
62.3 63.4 64.5 63.6
75.4 70.3 68.8 71.6

41.6 58.5 53.8 41.8
69.6 69.6 65.8 57.4
58.5 50.4 46.1 56.1
65.6 67.3 65.3 69.4

28.9 43.9 40.7 28.3
45.4 42.4 41.4 44.1
44.6 45 62.6 52.7
54 57.6 45.6 56.6

30.8 46.3 39.4 34.7
35.1 51.9 45.4 51.6
50.3 46.7 50.3 51.8
52.7 58.5 51 47.4
NWAYANOVAC


ARE YOUR DATA STORED AND OF THE REQUIRED ARRAY SHAPE? 
YES 
ENTER THE DATA ARRAY 
⎕:
DATA 
YOU HAVE 3 FACTORS: A B C WITH 4 4 4 LEVELS RESPECTIVELY. 
DO YOU HAVE ANY MISSING VALUES? 


NO 

IS YOUR EXPERIMENT A BLOCKED DESIGN? 

NO 

DO YOU HAVE ANY REPLICATIONS? 

NO 

DO YOU WANT ANY SPECIAL EFFECTS FOR THE ERROR TERM? 
YES 

ENTER THE SPECIAL EFFECTS SPECIFICATIONS.
BC 

DO YOU WANT ANY PROFILE ANALYSES? 

NO 

DO YOU WANT ANY MEANS TABLES? 

NO 

ALIGN PAPER 
ANALYSIS OF VARIANCE


SOURCE DF SS MS F PLEVEL


Notes and Hints 


1) The factor names, used in the analysis of variance table as well as in other places, are letters
of the alphabet. For example, if you have a 4 factor experiment, the factor names would be
'A', 'B', 'C', and 'D'. If you have designated 'B' as a replication rather than a factor,
you will then have a 3 factor experiment with factors 'A', 'B', and 'C', where 'B' is the
former 'C' and 'C' is the former 'D'. The factor names used in the specifications for
the error term, profiles, and tables must refer to these new names.


2) Specifications for the special effects, profiles, and tables are character vectors consisting of factor
combinations, separated by commas. For an N factor experiment, the segments may be of the
following forms:


a) ALL J
b) Any size J combinations of the first N letters of the alphabet (the factor names),
where J is an integer between 1 and N.


In the first form, each segment ALL J indicates all J factor interactions; in the second form, each
segment indicates a specific factor interaction. An example of a valid specification for a 4 factor
experiment is AB,DCB,ALL 4.


3) When entering the data, include any replications as a factor. Later, when queried about whether
you have any replications, answer in the affirmative only if you do not wish to consider the
replications as a separate factor.


4) Estimates of the missing values will be calculated using the multiple covariance method, with 1
degree of freedom subtracted from the total and error degrees of freedom for each value estimated.


5) NWAYANOVAP is a non-conversational version of NWAYANOVAC which produces the same analysis
and optional features without the conversational input. See the documentation for NWAYANOVAP
in this workspace.
References


Arner, S.L. Profile: A Computer Program for Displaying the Geometric Relationships of Responses
in a Factorial Design, U.S.D.A. Forest Service Paper NE-288, 1974.


Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, Toronto, 1960. 
Source 
R. Hui 


I.P. Sharp Associates Limited 
Calgary, Alberta 
NWAYANOVAP
Syntax 
NWAYANOVAP DATA 
Description 


NWAYANOVAP is a non-conversational function that performs an analysis of variance on an N factor 
experiment with more than 1 level per factor. By executing one or more of the state setting functions, 
optional features such as estimation of missing values, analyses for blocked or replicated designs, profile 
analyses, and means tables may also be obtained. 


Argument 


DATA - An array of the observations. The rank (⍴⍴DATA) of DATA is the number of factors in the
experiment, while the dimensions (⍴DATA) of DATA are the numbers of levels of the factors.
If you have replications, include them as a factor in DATA.


State Setting Functions 


Optional features for NWAYANOVAP can be obtained by changing the values of the state. The state is
set by a group of state setting functions. Those functions, with the default values in square brackets, 
are: 


STATE 
Displays the current state. 


DEFAULT 
Restores the state settings to the default values. 


MISSING VALUES XXX [MISSING VALUES OFF]
MISSING VALUES OFF indicates no estimation of missing values takes place.
MISSING VALUES XXX indicates that XXX is a filler value in the data array signifying missing 
values. Estimates of the missing values will be calculated using the multiple covariance method, 
with 1 degree of freedom subtracted from the total and error degrees of freedom for each value 
estimated. 


BLOCKS 'XXX' [BLOCKS 'OFF'] 
BLOCKS 'OFF' indicates that the experiment is not a blocked design; BLOCKS 'XXX' indicates 
that factor XXX should be treated as blocks. In the second case the block sum of squares will be 
an entry in the analysis of variance table, but the interactions of the blocks with the other factors 
will be included in the error term. XXX should be 'A' if the first factor is the blocks, 'B' if the 
second, and so on. 


REPLICATIONS 'XXX' [REPLICATIONS 'OFF'] 
In the case where replications are not to be treated as a separate factor (that is, error due to 
replications and all its interactions are to be included in the error term), 
REPLICATIONS 'XXX' should be used. 'XXX' is a character scalar or vector consisting of the
letter of the alphabet indicating the factor to be treated as replications. For example, type
REPLICATIONS 'B' if the second factor is to be treated as replications.
SPECIAL EFFECTS 'XXX' [SPECIAL EFFECTS 'OFF']
For SPECIAL EFFECTS 'OFF', the error term consists of the highest order factor interaction and
appropriate terms if BLOCKS and/or REPLICATIONS are set. Under an optional setting the terms
specified by XXX as well as the highest order factor interaction will be included in the error term.
'XXX' is a character vector consisting of factor combinations, separated by commas. For an N
factor experiment, the segments may be of the following forms:
1) ALL J, where J is an integer between 1 and N. 
2) Any size J combinations of the first N letters of the alphabet (the factor names). 
In the first form each segment ALL J indicates all J factor interactions; in the second form each 
combination indicates a specific factor interaction. An example of valid special effects specifications 
for a 4 factor experiment is SPECIAL EFFECTS 'AB,DCB,ALL 4'. 


PROFILES 'XXX' [PROFILES 'OFF'] 
PROFILES 'XXX' indicates that profiles of those treatment combination means specified by
'XXX' are to be produced. 'XXX' has the same format as that for special effects. (See above.)


TABLES 'XXX' [TABLES 'OFF']
TABLES 'XXX' indicates that tables of those treatment combination means specified by 'XXX' 
are to be produced. 'XXX' has the same format as that for special effects. (See above.) 
Output 
An analysis of variance table is produced giving the source of variation, degrees of freedom, sums of 
squares, mean squares, F-ratios, and the probability levels of the F-ratios (probability of obtaining 
a higher F-value). Profiles and/or tables of treatment means will also be printed at this time if 
requested. 
Example 
Taken from Steel and Torrie, pages 236-239. See References. 


DATA

42.9 53.8 49.5 44.4
53.3 57.6 59.8 64.1
62.3 63.4 64.5 63.6
75.4 70.3 68.8 71.6

41.6 58.5 53.8 41.8
69.6 69.6 65.8 57.4
58.5 50.4 46.1 56.1
65.6 67.3 65.3 69.4

28.9 43.9 40.7 28.3
45.4 42.4 41.4 44.1
44.6 45 62.6 52.7
54 57.6 45.6 56.6

30.8 46.3 39.4 34.7
35.1 51.9 45.4 51.6
50.3 46.7 50.3 51.8
52.7 58.5 51 47.4


SPECIAL EFFECTS 'BC'
WAS 'OFF'

TABLES 'ALL 1, AB'
WAS 'OFF'

STATE
MISSING VALUES OFF
BLOCKS 'OFF'
REPLICATIONS 'OFF'
SPECIAL EFFECTS 'BC'
PROFILES 'OFF'
TABLES 'ALL 1, AB'


NWAYANOVAP DATA
ALIGN PAPER


ANALYSIS OF VARIANCE


SOURCE  DF         SS        MS        F
A        3  2842.8731  947.6244  28.1154
B        3  2848.0219  949.3406  28.1663
C        3   170.5369   56.8456   1.6866
AB       9   618.2944   68.6994   2.0383
AC       9   104.2944   11.5883   0.3438
ERROR   36  1213.3737   33.7048
TOTAL   63  7797.3944
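
The effect of SPECIAL EFFECTS 'BC' on the error term can be seen by recomputing the sums of squares. The Python sketch below uses the standard balanced-design formulas (marginal totals squared, scaled, minus the correction term) and pools the BC and ABC interactions into the error term. It is an illustration under those assumptions, not the library's APL code, and it omits the probability levels.

```python
from itertools import product

# Sketch only: 3-factor ANOVA with BC pooled into the error term.
# DATA[a][b][c] is the 4 x 4 x 4 example array.
DATA = [
    [[42.9, 53.8, 49.5, 44.4], [53.3, 57.6, 59.8, 64.1],
     [62.3, 63.4, 64.5, 63.6], [75.4, 70.3, 68.8, 71.6]],
    [[41.6, 58.5, 53.8, 41.8], [69.6, 69.6, 65.8, 57.4],
     [58.5, 50.4, 46.1, 56.1], [65.6, 67.3, 65.3, 69.4]],
    [[28.9, 43.9, 40.7, 28.3], [45.4, 42.4, 41.4, 44.1],
     [44.6, 45.0, 62.6, 52.7], [54.0, 57.6, 45.6, 56.6]],
    [[30.8, 46.3, 39.4, 34.7], [35.1, 51.9, 45.4, 51.6],
     [50.3, 46.7, 50.3, 51.8], [52.7, 58.5, 51.0, 47.4]],
]
shape = (4, 4, 4)
cells = {idx: DATA[idx[0]][idx[1]][idx[2]] for idx in product(*map(range, shape))}
n = len(cells)
correction = sum(cells.values()) ** 2 / n


def raw_ss(axes):
    """Scaled sum of squared marginal totals over the given axes, minus the correction."""
    totals = {}
    for idx, x in cells.items():
        key = tuple(idx[a] for a in axes)
        totals[key] = totals.get(key, 0.0) + x
    per_total = n // len(totals)          # observations behind each marginal total
    return sum(t ** 2 for t in totals.values()) / per_total - correction


ss = {"A": raw_ss([0]), "B": raw_ss([1]), "C": raw_ss([2])}
ss["AB"] = raw_ss([0, 1]) - ss["A"] - ss["B"]
ss["AC"] = raw_ss([0, 2]) - ss["A"] - ss["C"]
ss["TOTAL"] = raw_ss([0, 1, 2])
# Error term: the highest order interaction ABC plus the special effect BC.
ss["ERROR"] = ss["TOTAL"] - sum(ss[k] for k in ("A", "B", "C", "AB", "AC"))

df = {"A": 3, "B": 3, "C": 3, "AB": 9, "AC": 9, "ERROR": 36}
ms_error = ss["ERROR"] / df["ERROR"]
for name in ("A", "B", "C", "AB", "AC"):
    ms = ss[name] / df[name]
    print(name, df[name], "%.4f" % ss[name], "%.4f" % ms, "%.4f" % (ms / ms_error))
print("ERROR", df["ERROR"], "%.4f" % ss["ERROR"], "%.4f" % ms_error)
print("TOTAL", 63, "%.4f" % ss["TOTAL"])
```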




Notes and Hints 


1) The printing of the profiles and/or tables may be interrupted by pressing the ATTENTION
or BREAK key.


2) The factor names, used in the analysis of variance table as well as in other places, are letters
of the alphabet. For example, if you have a 4 factor experiment, the factor names would be
'A', 'B', 'C', and 'D'. If you designate (say) 'B' as a replication, you will then have a
3 factor experiment with factors 'A', 'B', and 'C', where 'B' is the former 'C' and 'C' is
the former 'D'. The factor names used in the specifications for the error term, profiles, and tables
must refer to these new names.


3) If you prefer a conversational style program, NWAYANOVAC in this workspace should be used. This
function is also appropriate if the data array is not already entered or is not of the correct form.
References


Arner, S.L. Profile: A Computer Program for Displaying the Geometric Relationships of Responses
in a Factorial Design, U.S.D.A. Forest Service Paper NE-288, 1974.


Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, Toronto, 1960.
Source 
R. Hui 


I.P. Sharp Associates Limited
Calgary, Alberta 
SPLITPLOTC
Syntax 
SPLITPLOTC DATA 
Description 


SPLITPLOTC performs an analysis of variance for a split plot or split-split plot design incorporating
a completely random or a randomized complete block design.


Argument

DATA - A 3-dimensional array (for split plot) or a 4-dimensional array (for split-split plot), with
the first dimension representing blocks or replications, the second whole units, the third
sub-units, and the fourth (if any) the sub-sub-units.

Input 

You will be asked to enter the following conversationally: 

1) Whether you have any missing values or not. If yes, the "filler" value in the data will be replaced
by estimates computed by the multiple covariance method.

2) Whether your experiment is a blocked design. Answer in the affirmative if you do not wish the
block sum of squares to be included in the error sum of squares for the whole units.

Output

An analysis of variance table is produced giving the source of variation, degrees of freedom, sums of
squares, mean squares, F-ratios, and the probability levels of the F-ratios (probability of obtaining
a higher F-value).
Example


Taken from Steel and Torrie, pages 236-239. See Reference. For program testing purposes two
spurious missing values (0's) are introduced into the data. Compare with results in Steel and Torrie.


DATA

42.9 53.8 49.5 44.4
53.3 57.6 59.8 64.1
62.3 63.4 64.5 0
75.4 70.3 68.8 71.6

41.6 58.5 53.8 41.8
69.6 69.6 65.8 57.4
58.5 50.4 46.1 56.1
65.6 67.3 65.3 69.4

28.9 43.9 40.7 28.3
45.4 42.4 41.4 44.1
44.6 45 62.6 52.7
54 0 45.6 56.6

30.8 46.3 39.4 34.7
35.1 51.9 45.4 51.6
50.3 46.7 50.3 51.8
52.7 58.5 51 47.4
SPLITPLOTC DATA

DO YOU HAVE ANY MISSING VALUES? 

YES 

WHICH VALUE REPRESENTS THE MISSING VALUES?

⎕:
0

IS THIS A BLOCKED DESIGN? 

YES 
ANALYSIS OF VARIANCE


SOURCE   DF         SS        MS        F  PLEVEL
BLOCKS    3  2818.2311  939.4104  13.7492  0.0012
A         3  2863.7639  954.5880  13.9744  0.0011
ERROR A   9   614.9217   68.3246


Notes and Hints 


The function INPUTDATA in this workspace may be helpful in setting up the data array DATA for
SPLITPLOTC. For a description of this function, see the first entry under "Input" in the documentation 
for NWAYANOVAC. INPUTDATA is invoked by typing INPUTDATA, and constructs the result as the global 
variable DATA. 

Reference 

Steel, R.G. and J.H. Torrie. Principles and Procedures of Statistics, McGraw-Hill, Toronto, 1960.


Source 


R. Hui 
I.P. Sharp Associates Limited
Calgary, Alberta 
34 COVARIANCE


FUNCTIONS

Function
Header        Documentation    Description
ANOCOVA K     ANOCOVAHOW       Performs an analysis of covariance for
                               a single variable of classification with
                               K populations.
ANOCOVA
Syntax 
ANOCOVA K 
Description 


ANOCOVA performs an analysis of covariance on a single variable of classification with K populations.
The K regression curves are tested for linearity and equality of slopes. The hypothesis that the 
populations have the same Y means, after adjustment for X values, is also tested. 


Argument 
K - The number of sample populations to be entered. 
Input 


1) You are requested to enter the data for X (the control variable or covariate) and then for Y, for
each of the K populations.

2) You must also enter the size of the critical region (alpha) for which Bartlett's test will be
conducted.


Output 


1) The results of Bartlett's test for homogeneity of variances.

2) Analysis of covariance table.

3) Test for the hypothesis of separate but parallel lines.

4) Results of the test for a common line for the means.

5) Results of the test that the common line is the overall regression line.
6) Results of the test for equality in adjusted Y means.

7) Results of the test for a common regression line for all observations. 
8) Equations for all lines. 

9) Computed slope (overall), average slope, estimated slope. 

10) Grand means of X and Y. 

11) Individual Y means and adjusted Y means. 
Example


ANOCOVA 3
ENTER X GROUP 1
⎕:


ENTER Y GROUP 1
⎕:


ENTER X GROUP 2
⎕:

4 3 3 5
ENTER Y GROUP 2
⎕:

12 12 10 13
ENTER X GROUP 3
⎕:

1 2 3 1
ENTER Y GROUP 3
⎕:

6 5 8 7


TO DETERMINE THE THEORETICAL JUSTIFICATION FOR POOLING THE
VARIANCES,BARTLETT'S TEST FOR HOMOGENEITY IS PERFORMED.
TYPE IN ALPHA
⎕:

0.05
THE HYPOTHESIS OF HOMOGENEITY OF VARIANCES IS ACCEPTED
THE PROBABILITY OF OCCUR

THEIR REGRESSION LINE 9.844 1 9.844 
BETWEEN INDIVIDUAL SLOPES 0.360 2 0.180 
BETWEEN INDIVIDUAL LINES 10.932 6 1.822 


ABOUT THE OVERALL LINE 28.784 10 
DUE TO THE OVERALL LINE 42.882 1 
TOTAL 71.667 11 


TEST HYPOTHESIS OF SEPARATE,BUT PARALLEL LINES
THE HYPOTHESIS IS ACCEPTED AT THE 0.95 LEVEL
THE PROBABILITY OF THE COMPUTED F VALUE OF 0.099 = 0.099


TEST HYPOTHESIS OF ONE COMMON LINE FOR THE MEANS


THE HYPOTHESIS IS REJECTED AT THE 0.95 LEVEL
THE PROBABILITY OF THE COMPUTED F VALUE OF 6.975 = 0.971
TEST HYPOTHESIS THAT THIS COMMON LINE IS THE OVERALL REGRESSION LINE


THE HYPOTHESIS IS REJECTED AT THE 0.95 LEVEL
THE PROBABILITY OF THE COMPUTED F VALUE OF 5.419 = 0.953


IF THE HYPOTHESIS OF PARALLELISM IS ACCEPTED,THE NEXT TEST IS VALID


TEST THE HYPOTHESIS FOR EQUALITY IN ADJUSTED Y MEANS
THE HYPOTHESIS IS REJECTED AT THE 0.95 LEVEL
THE PROBABILITY OF THE COMPUTED F VALUE OF 6.197 = 0.976


TEST THE HYPOTHESIS FOR A COMMON REGRESSION LINE FOR ALL OBSERVATIONS
THE HYPOTHESIS IS ACCEPTED AT THE 0.95 LEVEL
THE PROBABILITY OF THE COMPUTED F VALUE OF 2.45 = 0.844


THE EQUATION FOR LINE 1 IS 9.25+1(X- 2)
THE EQUATION FOR LINE 2 IS 11.75+1(X- 3.75)
THE EQUATION FOR LINE 3 IS 6.5+0.545(X- 1.75)


THE COMPUTED OVERALL SLOPE,(E.G. FOR ALL OBSERVATIONS)= 3.282608696
THE AVERAGE SLOPE, (WHEN LINES ARE PARALLEL )= 2.184210526
THE ESTIMATED SLOPE, (ONE LINE FOR GROUP MEANS )= 0.8333333333
THE GRAND MEAN OF X IS 2.5 
THE GRAND MEAN OF Y IS 9.166666667 
Y MEANS ADJUSTED Y MEANS 
9.25 9.666666667 
11.75 10.70833333 
6.5 7.125 


THESE COMPUTATIONS SHOULD BE CONSIDERED IN LIGHT OF ABOVE HYPOTHESES!! 


Notes and Hints
For analysis of variance, see the workspace 34 ANOVA.
Reference


Dixon, W.J. and F.J. Massey. Introduction to Statistical Analysis, McGraw-Hill, 1957, Chapter 12.
LIBRARY 35


MULTIVARIATE ANALYSIS 


Workspaces: DISCRIMINANT 


PRINCIPAL 
35 DISCRIMINANT


FUNCTIONS

Function
Header              Documentation    Description
DISCRIM K           DISCRIMHOW       Performs discriminant analysis using a
                                     full set of discriminating variables.
LEVELS STEPWISE K   STEPWISEHOW      Performs discriminant analysis using
                                     an optimal subset of discriminating
                                     variables determined by a stepwise process.
DISCRIM


Syntax 

DISCRIM K 

Description 

The function DISCRIM executes a procedure which enables one or more multivariate units to be
classified into one of several populations. The population assignment procedure is based upon a model
of a multivariate normal distribution of units within populations. For each population, a sample of
units which is already known to belong to that population is required.

Argument 

K - the number of variables measured on each multivariate unit. 

State Setting Functions 

Certain optional features for DISCRIM can be selected by changing the values of the state. The state
is determined by a group of state setting functions, which are listed below (with their default values
given in square brackets, where applicable).


STATE 
Displays the current state. 


DEFAULT 
Returns the entire state to its default values. 


NUMPOPS NP [NUMPOPS 2]
Specifies the number of populations to be considered in the discriminant analysis. If there are 
more than 2 populations, this function must be invoked before DISCRIM is executed. 


PRIORPROBS PP [PRIORPROBS 1 1]
Specifies the initial probabilities of an arbitrary unit's belonging to each population. PP is a vector
having the same number of elements as there are populations. The elements of PP must have
one of the following three characteristics:


1) all lie between (but exclusive of) 0 and 1, and thus represent the actual priors themselves; 


2) all equal 1, in which case prior probabilities will be ignored completely in the discriminant 
analysis which follows; 


3) all equal 0, in which case prior probabilities will be determined by the respective sample sizes 
of the populations. (For example, if from 2 populations, there were 6 observations sampled 
from population 1 and 4 observations sampled from population 2, then the prior probabilities 
would be calculated as 0.6 and 0.4, respectively.) 


Note: Whenever NUMPOPS NP is invoked, PRIORPROBS is automatically defined as
PRIORPROBS NP⍴1.
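
The three permitted forms of PP can be summarized in a few lines. The Python sketch below is a hypothetical helper, not part of the workspace; it returns the prior probabilities that each form implies, using None to stand for "priors ignored".

```python
def resolve_priors(pp, sample_sizes):
    """Interpret a PRIORPROBS argument according to the three documented cases."""
    if len(pp) != len(sample_sizes):
        raise ValueError("PP needs one element per population")
    if all(p == 1 for p in pp):
        return None                                  # case 2: priors are ignored
    if all(p == 0 for p in pp):
        total = sum(sample_sizes)                    # case 3: proportional to sample sizes
        return [n / total for n in sample_sizes]
    if all(0 < p < 1 for p in pp):
        return list(pp)                              # case 1: the priors themselves
    raise ValueError("PP must be all 1, all 0, or all strictly between 0 and 1")


print(resolve_priors([0, 0], [6, 4]))
print(resolve_priors([1, 1], [6, 4]))
print(resolve_priors([0.3, 0.7], [6, 4]))
```

The first call corresponds to the example in the text: 6 and 4 sampled observations give priors of 0.6 and 0.4.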
CALCPOST or NOCALCPOST [NOCALCPOST]
Specifies whether or not posterior probabilities of belonging to each population are to be calculated 
and printed for the original sampled units from the given populations. These probabilities are 
affected by prior probabilities if the latter have been specified through PRIORPROBS. 


POOLED or UNPOOLED [POOLED]
Specifies the manner in which covariance matrices are to be estimated. If POOLED is used, then
the covariance matrix is assumed to be the same in each population, in which case all samples 
will be pooled to provide one estimate of this matrix. If, however, UNPOOLED is selected, the 
covariance matrices for each population are all assumed to be different. In this case, the estimate 
of the covariance matrix for each population will be based only upon the sample of units from 
that population. 


MINERROR or CANONICAL [MINERROR]
Specifies which approach (minimizing the total error due to misclassification versus employing 
canonical variables) is to be adopted for the discriminant analysis. With the MINERROR approach, 
a unit is classified into the population for which the estimated probability density is a maximum. 
Under the CANONICAL approach, a unit is classified into the population to whose mean it is 
nearest in distance, after the within-population variation has first been minimized. More details 
on these two approaches are given in Lachenbruch, Chapter 5 (see References). 


EVALSCORES or EVALPROBS [EVALSCORES] 
Specifies whether an unknown unit or observation is to be classified into one of the populations 
by evaluating the "scores" (using coefficients determined from either the MINERROR or
CANONICAL procedure) or by evaluating Bayesian posterior probabilities of the unit's belonging 
to each population. More details on these two state values are given in Notes and Hints 6) and 
7) below. 


PLOT or NOPLOT [NOPLOT]
Specifies whether or not a plot of the first two canonical variables, evaluated at the original sample
units, is to be printed. The state value PLOT is only relevant to the CANONICAL procedure and
is ignored if MINERROR has been selected.


Input 


After the state has been set and DISCRIM has been invoked, a request will be made for a data matrix 
for population 1. The rows of this matrix represent the sample units from this population, while each 
column contains the values of one variable over all of the units. (There is one column for each 
variable.) Once this matrix has been entered, a request is made for a data matrix for population 2, 
and this process continues until such matrices have been entered for each population. These matrices 
may have different numbers of rows because sample sizes from each population need not be equal. 
However, each matrix must have the same number of columns, representing the number of variables
being used. 


Output 

The following output is always provided:

1) the population means and standard deviations for each variable; 

2) the pooled within-population correlation matrix; 

3) the test statistic, degrees of freedom, and significance level for the hypothesis of equal covariance
matrices among the populations. (For more theoretical background concerning this statistic, see
Box.)


If POOLED is in effect, a chi-square test statistic is printed, as are the corresponding degrees of freedom
and the significance level for the hypothesis of equal population means between the given populations
for every variable. (This chi-square statistic is equal to the generalized Mahalanobis D-Square
Statistic.) If instead, UNPOOLED is the chosen option, the values of ¯.5 times the inverse of each
covariance matrix are printed.


If MINERROR is selected, the coefficients of the discriminant functions are printed. On the other hand, 
if CANONICAL is the chosen state value, then the coefficients of the canonical variables (read column- 
wise) are output, along with the dispersion explained by each canonical variable, the cumulative 
percentage dispersion explained by each canonical variable, the canonical correlations, and the coeffi- 
cients of the canonical variables evaluated at the population means. 


All of the original sample units are then classified according to either the MINERROR or 
CANONICAL procedure, depending on the state. Each misclassification is explicitly listed, and if 
CALCPOST is used, the posterior probabilities for the original units are also printed. In addition, two 
classification matrices - one regular and one normalized - are provided. (These matrices are defined 
in Press, pages 381-382, although he calls them Confusion matrices rather than classification ma- 
trices.) The rows of each classification matrix represent the known populations of the units, while 
the columns represent the populations into which these units are classified. 
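
The relationship between the two classification matrices can be sketched in Python. The sketch assumes that "normalized" means each row is divided by its row total (the number of units known to come from that population), which is consistent with the Iris example in this section; the function name is hypothetical.

```python
def normalize_classification_matrix(counts):
    """Divide each row of a classification matrix by its row total."""
    normalized = []
    for row in counts:
        total = sum(row)
        # Guard against an empty population to avoid division by zero.
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


# Counts from the Iris example: rows are known populations,
# columns are the populations into which the units were classified.
iris_counts = [[50, 0, 0], [0, 48, 2], [0, 1, 49]]
iris_normalized = normalize_classification_matrix(iris_counts)
```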


Finally, if requested via PLOT, a scatter plot of the first two canonical variables, evaluated at the 
original units, is produced. 


Example


Fisher's Iris data, taken from Wilf, pages 116-118 (see References).


      STATE
NUMPOPS 2
MINERROR
POOLED
PRIORPROBS 1.000000 1.000000
NOCALCPOST
EVALSCORES
NOPLOT
      NUMPOPS 3
WAS 2
PRIORPROBS ARE NOW EQUAL TO 1 1 1


      DISCRIM 4
ENTER THE DATA MATRIX FOR POPULATION 1
⎕:
      DISCIRIS1
ENTER THE DATA MATRIX FOR POPULATION 2
⎕:
      DISCIRIS2
ENTER THE DATA MATRIX FOR POPULATION 3
⎕:
      DISCIRIS3


THE POPULATION MEANS FOR EACH VARIABLE
             POPULATION 1   POPULATION 2   POPULATION 3     GRAND MEAN

VARIABLE 1       5.006000       5.936000       6.588000       5.843333
VARIABLE 2       3.428000       2.770000       2.974000       3.057333
VARIABLE 3       1.462000       4.260000       5.552000       3.758000
VARIABLE 4       0.246000       1.326000       2.026000       1.199333


THE POPULATION STANDARD DEVIATIONS FOR EACH VARIABLE
             POPULATION 1   POPULATION 2   POPULATION 3

VARIABLE 1       0.352490       0.516171       0.635880
VARIABLE 2       0.379064       0.313798       0.322497
VARIABLE 3       0.173664       0.469911       0.551895
VARIABLE 4       0.105386       0.197753       0.274650


THE WITHIN-POPULATION CORRELATION MATRIX
             VARIABLE 1   VARIABLE 2   VARIABLE 3   VARIABLE 4

VARIABLE 1      1.00000      0.53024      0.75616      0.36451
VARIABLE 2      0.53024      1.00000      0.37792      0.47053
VARIABLE 3      0.75616      0.37792      1.00000      0.48446
VARIABLE 4      0.36451      0.47053      0.48446      1.00000


CHI-SQUARE TEST VALUE FOR EQUALITY OF COVARIANCE MATRICES: 141.0575
DEGREES OF FREEDOM: 20
SIGNIFICANCE LEVEL: 0.0000000


CHI-SQUARE TEST VALUE FOR EQUALITY OF POPULATION MEANS FOR EACH VARIABLE
(=GENERALIZED MAHALANOBIS D-SQUARE): 4774.1661
DEGREES OF FREEDOM: 8
SIGNIFICANCE LEVEL: 0.0000000


COEFFICIENTS OF THE DISCRIMINANT FUNCTIONS
             POPULATION 1   POPULATION 2   POPULATION 3

CONSTANT       ¯85.209858     ¯71.753995    ¯103.269708
VARIABLE 1      28.545167      15.698209      12.445849
VARIABLE 2      23.587870       7.072510       3.685280
VARIABLE 3     ¯16.430639       5.211451      12.766545
VARIABLE 4     ¯17.398411       5.434229      21.079113


POPULATION 2 OBSERVATION 21 HAS BEEN MISCLASSIFIED INTO POPULATION 3 
POPULATION 2 OBSERVATION 34 HAS BEEN MISCLASSIFIED INTO POPULATION 3 
POPULATION 3 OBSERVATION 34 HAS BEEN MISCLASSIFIED INTO POPULATION 2


CLASSIFICATION MATRIX
               POPULATION 1   POPULATION 2   POPULATION 3

POPULATION 1             50              0              0
POPULATION 2              0             48              2
POPULATION 3              0              1             49


NORMALIZED CLASSIFICATION MATRIX
               POPULATION 1   POPULATION 2   POPULATION 3

POPULATION 1       1.000000       0.000000       0.000000
POPULATION 2       0.000000       0.960000       0.040000
POPULATION 3       0.000000       0.020000       0.980000


REMEMBER TO SAVE THIS WORKSPACE IF YOU DESIRE
IN THE FUTURE TO CLASSIFY MULTIVARIATE UNITS OR OBSERVATIONS
OF UNKNOWN ORIGIN INTO ONE OF THE KNOWN POPULATIONS.


Notes and Hints 


1) If it is decided during the data input phase to prevent DISCRIM from executing further, type
QUIT and return the carriage.


2) Missing values are allowed in each data matrix. For each missing value, enter ¯999.99. The
missing value for a variable from a unit in a population will be replaced by the mean of the
variable over all other existing units in the population. The resulting loss in degrees of freedom
will be reflected in the estimate of the covariance matrix using that population's data.


3) The CANONICAL procedure ignores prior probabilities and automatically uses the state settings
NOCALCPOST and POOLED.


4) Under the MINERROR procedure, if prior probabilities are specified or calculated, they affect only
the constant terms of the discriminant functions and the posterior probabilities of the original
units, provided that the latter are requested via CALCPOST.


5) The function DISCRIM creates a temporary file named 2525647 which stores the data
matrices for all the populations. This file is automatically erased after any successful execution
of DISCRIM; should execution of the function not complete successfully, this file will remain in
existence. Any subsequent successful execution of DISCRIM will erase the file; otherwise, it must
be erased directly by the user.


6) If the MINERROR approach is used, DCON and DVAR are generated as global variables. If
UNPOOLED is also in effect, then DSSCP becomes a third global variable. Referring to the output,
the first row of numbers under the heading "Coefficients of the Discriminant Functions" is the
vector stored in DCON; the remaining rows under this heading form the transpose of the matrix
DVAR. With the UNPOOLED option, the values of ¯.5 times the inverse of each covariance matrix
are stored in DSSCP; DSSCP is thus a three-dimensional array, each plane corresponding to a
population.


If instead, the CANONICAL approach is followed, there are just two global variables produced:
CANCON and EVECTORS. The matrix under the heading "Coefficients of Canonical Variables" in
the output is stored in EVECTORS, while CANCON contains the coefficients of the canonical variables
evaluated at the population means. (However, although all canonical variables are given, only
those corresponding to a non-zero dispersion are meaningful for classification purposes.)


In either case, these global variables are used to classify unknown units into one of the
populations. Such a classification of unknown units can also be determined through this workspace by
executing the function EVALUATE subsequent to an execution of the function DISCRIM. A request
will be made to enter the unknown units; they should be entered as a matrix with one unit per
row.


Then, if EVALSCORES is the chosen state, the classification scores, calculated as in Lachenbruch,
Chapter 5, will be printed for each unknown unit, as will the maximum score and the population
corresponding to that maximum score. Alternatively, if EVALPROBS is in effect, then the posterior
probability of each unknown unit's belonging to each population will be printed, along with the
maximum probability and the population corresponding to that maximum probability. These
probabilities are obtained using a Bayesian approach and are discussed in Press, pages 375-376.


7) If EVALPROBS is chosen, one additional global variable is generated: a package of variables
which are required for the calculation of posterior probabilities. Since the
global variables mentioned above in Note 6) will also exist, the user is able to classify unknown
units using both the EVALPROBS and EVALSCORES approaches. However, should it ever be
necessary to examine these posterior probabilities, then the state function EVALPROBS must be
invoked before DISCRIM is executed. Also, these posterior probabilities can only be calculated if
the number of sampled units from each population exceeds K (the number of variables) +


References 


Box, G.E.P. "A General Distribution Theory for a Class of Likelihood Criteria", Biometrika, 1949,
pp. 317-346.


Lachenbruch, P.A. Discriminant Analysis, Hafner Press, New York, 1975.


Press, S.J. Applied Multivariate Analysis, Holt Press, San Francisco, 1972.


Wilf, H.S. "A Method of Coalitions in Statistical Discriminant Analysis", Statistical Methods for
Digital Computers, edited by Enslein, Ralston and Wilf, John Wiley and Sons Inc., New York, 1977.


Source 
A. MacLeod 


I.P. Sharp Associates Limited
Ottawa, Ontario 


STEPWISE


Syntax 
LEVELS STEPWISE K 
Description 


The function STEPWISE executes a procedure which enables one or more multivariate units to be 
classified into one of several populations. The variables which are used in this procedure are chosen 
in a stepwise manner. At each step, either the variable which adds the most to the separation of the 
populations is entered into (or the variable that adds the least is removed from) the optimal set of 
variables required for classification. The population assignment procedure assumes a model of a 
multivariate normal distribution of units within populations, where each population has the same 
covariance matrix but a different vector of means. For each population, a sample of units which is 
already known to belong to that population is required. 


Arguments 


LEVELS - A vector with three elements, each between 0 and 1. At any step, the variable with the 
largest F-to-enter value is included in the optimal set of variables if the corresponding lower 
F-probability does not exceed the first element of LEVELS. The variable with the smallest 
F-to-remove value is deleted from the optimal set of variables if the corresponding lower 
F-probability is greater than the second element of LEVELS. The third element of LEVELS is 
called the tolerance threshold; any unentered variable whose tolerance (defined as 1 minus 
the square of the variable’s within-group multiple correlation with the currently entered 
variables) is less than the tolerance threshold at any step is not allowed to enter the optimal 
set of variables at that step. The argument LEVELS need not be explicitly stated; if it is not 
specified, the default values of 0.05, 0.05 and 0.0001, respectively, are assumed. 


K - The initial number of variables measured on each multivariate unit. An optimal subset of these
variables, determined by STEPWISE, will be used in the discriminant analysis. This argument 
must be specified. 


State Setting Functions 
Certain optional features of STEPWISE can be selected by changing the values of the state. The state
is determined by a group of state setting functions, which are listed below (with their default values
given in square brackets, where applicable).


STATE 
Displays the current state. 


DEFAULT 
Returns the entire state to its default values. 


NUMPOPS NP [NUMPOPS 2]
Specifies the number of populations to be considered in the discriminant analysis. If there are 
more than two populations, this function must be invoked before STEPWISE is executed. 


PRIORPROBS PP [PRIORPROBS 1 1]
Specifies the initial probabilities of an arbitrary unit's belonging to each population. PP is a vector
having the same number of elements as there are populations. The elements of PP must have 
one of the following three characteristics: 


1) all lie between (but exclusive of) 0 and 1, and thus represent the actual priors themselves; 

2) all equal 1, in which case prior probabilities will be ignored completely in the discriminant 
analysis which follows; 

3) all equal 0, in which case prior probabilities will be determined by the respective sample sizes
of the populations. (For example, if from 2 populations, there were 7 observations sampled 
from population 1 and 3 observations sampled from population 2, then the prior probabilities 
would be calculated as 0.7 and 0.3, respectively.) 


Note: Whenever NUMPOPS NP is invoked, PRIORPROBS is automatically invoked as
PRIORPROBS NP⍴1.
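
The three cases for PP can be summarized in a short Python sketch. It is an illustration of the rules stated above, not the workspace's own code; the function name is hypothetical, and returning None to mean "priors ignored" is a convention chosen for this sketch.

```python
def resolve_priors(pp, sample_sizes):
    """Interpret a PRIORPROBS vector according to the three documented cases."""
    if all(p == 1 for p in pp):
        # Case 2: all ones, so prior probabilities are ignored completely.
        return None
    if all(p == 0 for p in pp):
        # Case 3: all zeros, so priors are proportional to the sample sizes.
        total = sum(sample_sizes)
        return [n / total for n in sample_sizes]
    if all(0 < p < 1 for p in pp):
        # Case 1: the values are the actual priors themselves.
        return list(pp)
    raise ValueError("PP must be all ones, all zeros, or all strictly between 0 and 1")


# The example from the text: 7 and 3 observations give priors 0.7 and 0.3.
priors_from_sizes = resolve_priors([0, 0], [7, 3])
priors_ignored = resolve_priors([1, 1, 1], [50, 50, 50])
```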


CALCPOST or NOCALCPOST [NOCALCPOST ] 
Specifies whether or not posterior probabilities of belonging to each population are to be calculated 
and printed for the original sample units from the given populations. These probabilities are 
affected by prior probabilities if the latter have been specified through PRIORPROBS.


MINERROR or CANONICAL [MINERROR]
Specifies which approach (minimizing the total error due to misclassification versus employing 
canonical variables) is to be adopted for the discriminant analysis. With the MINERROR approach,
a unit is classified into the population for which the estimated probability density is a maximum. 
Under the CANONICAL approach, a unit is classified into the population to whose mean it is 
nearest in distance, after the within-population variation has first been minimized. More details 
on these two approaches are given in Lachenbruch, Chapter 5. (See Reference.) 


PLOT or NOPLOT [NOPLOT]
Specifies whether or not a plot of the first two canonical variables, evaluated at the original sample
units, is to be printed. The state value PLOT is only relevant to the CANONICAL procedure and
is ignored if MINERROR has been selected. 


Input 


After the state has been set and STEPWISE has been invoked, a request will be made for a data matrix 
for population 1. The rows of this matrix represent the sample units from this population, while each 
column contains the values of one variable over all of the units. (There is one column for each 
variable.) Once this matrix has been entered, a request is made for a data matrix for population
2. This process continues until such matrices have been entered for each population. These matrices
may have different numbers of rows because sample sizes from each population need not be equal. 
However, each matrix must have the same number of columns, representing the initial number of 
variables, K, being considered.


Output 


Initial output items include the population means and standard deviations for each variable, as well 
as the pooled within-population correlation matrix. This is followed by the output for step 0, which 
is the step in which no variables have yet been entered into the optimal set. The F-to-enter
values of the variables and the degrees of freedom are printed here.


At each subsequent step, the F-to-remove values are printed for all variables included in the optimal
set and the F-to-enter values are printed for all variables not yet included in this set. In addition,
at each step, Wilks' lambda statistic (and an approximate F statistic which is a transformation of
the Wilks statistic) are given; these statistics test the equality of population means for the variables
included in the optimal set. Finally, an F-matrix for testing the equality of each pair of population
means is output; this test is based only on the variables included in the optimal set. 
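
The text does not say which transformation of Wilks' lambda is used. As a hedged illustration, the Python sketch below applies Rao's F approximation, which is an assumption on our part; it does reproduce the degrees of freedom and the approximate F value printed at step 2 of the Iris example (lambda .036884 with 2 variables, 2 hypothesis degrees of freedom and 147 error degrees of freedom). The function name is hypothetical.

```python
import math


def rao_f_from_wilks(lam, p, q, error_df):
    """Rao's F approximation for Wilks' lambda.

    lam      -- Wilks' lambda
    p        -- number of variables currently in the set
    q        -- hypothesis degrees of freedom (number of populations minus 1)
    error_df -- error degrees of freedom (total units minus number of populations)
    Returns (F, numerator df, denominator df).
    """
    pq = p * q
    denom = p * p + q * q - 5
    # By convention t is taken as 1 when the ratio is undefined (e.g. p = 1, q = 2).
    t = math.sqrt((pq * pq - 4) / denom) if denom > 0 else 1.0
    w = error_df + q - (p + q + 1) / 2
    df1 = pq
    df2 = w * t - (pq - 2) / 2
    root = lam ** (1 / t)
    f = (1 - root) / root * df2 / df1
    return f, df1, df2


f_step2, df1_step2, df2_step2 = rao_f_from_wilks(0.036884, p=2, q=2, error_df=147)
```

With one variable and three populations the same function returns degrees of freedom 2 and 147, matching step 1 of the example.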


Stepping stops when no unentered variable has a large enough tolerance or a large enough F-to-enter 
value; an appropriate message is then printed. 


If MINERROR is selected, the coefficients of the discriminant functions are printed next. On the other
hand, if CANONICAL is the chosen state value, then the coefficients of the canonical variables (read 
column-wise) are output, along with the dispersion explained by each canonical variable, the cumula- 
tive percentage dispersion explained by each canonical variable, the canonical correlations, and the 
coefficients of the canonical variables evaluated at the population means.


All of the original sampled units are then classified according to the MINERROR or CANONICAL 
procedure, depending on the state. Each misclassification is explicitly listed, and if CALCPOST is used, 
the posterior probabilities for the original units are also printed. In addition, two classification matrices 
-- one regular and one normalized -- are provided. (These matrices are defined in Press, pages 
381-382, although he calls them Confusion rather than classification matrices.) The rows of each 
classification matrix represent the known populations of the units, while the columns represent the 
populations into which these units are classified. 


Finally, if requested via PLOT, a scatter plot of the first two canonical variables, evaluated at the
original units, is produced. 


Example


Fisher's Iris data, taken from Wilf, pages 116-118. (See References.) The data is displayed in the
example of the function DISCRIM in this workspace.


      STATE
NUMPOPS 2
MINERROR
POOLED
PRIORPROBS 1.000000 1.000000
NOCALCPOST
EVALSCORES
NOPLOT
      NUMPOPS 3
WAS 2
PRIORPROBS ARE NOW EQUAL TO 1 1 1
      CANONICAL
WAS MINERROR
      PLOT
WAS NOPLOT


      .05 .05 .0002 STEPWISE 4


ENTER THE DATA MATRIX FOR POPULATION 1
⎕:
      DISCIRIS1


ENTER THE DATA MATRIX FOR POPULATION 2
⎕:
      DISCIRIS2


ENTER THE DATA MATRIX FOR POPULATION 3
⎕:
      DISCIRIS3


THE POPULATION MEANS FOR EACH VARIABLE
             POPULATION 1   POPULATION 2   POPULATION 3     GRAND MEAN

VARIABLE 1       5.006000       5.936000       6.588000       5.843333
VARIABLE 2       3.428000       2.770000       2.974000       3.057333
VARIABLE 3       1.462000       4.260000       5.552000       3.758000
VARIABLE 4       0.246000       1.326000       2.026000       1.199333


THE POPULATION STANDARD DEVIATIONS FOR EACH VARIABLE
             POPULATION 1   POPULATION 2   POPULATION 3

VARIABLE 1       0.352490       0.516171       0.635880
VARIABLE 2       0.379064       0.313798       0.322497
VARIABLE 3       0.173664       0.469911       0.551895
VARIABLE 4       0.105386       0.197753       0.274650


THE WITHIN-POPULATION CORRELATION MATRIX
             VARIABLE 1   VARIABLE 2   VARIABLE 3   VARIABLE 4

VARIABLE 1      1.00000       .53024       .75616       .36451
VARIABLE 2       .53024      1.00000       .37792       .47053
VARIABLE 3       .75616       .37792      1.00000       .48446
VARIABLE 4       .36451       .47053       .48446      1.00000


F-TO-ENTER LEVEL: 0.05
F-TO-REMOVE LEVEL: 0.05
TOLERANCE THRESHOLD: 0.0002


STEP NUMBER 0
VARIABLES NOT ENTERED WITH F-TO-ENTER STATISTICS (D.F. = 2 147)

          1           2           3           4
   119.2645     49.1600   1180.1612    960.0071


STEP NUMBER 1
VARIABLE ENTERED: 3


VARIABLES ENTERED WITH F-TO-REMOVE STATISTICS (D.F. = 2 147)
          3
  1180.1612


VARIABLES NOT ENTERED WITH F-TO-ENTER STATISTICS (D.F. = 2 146)
          1           2           4
    34.3231     43.0355     24.7657


WILKS'-LAMBDA STATISTIC .058628
DEGREES OF FREEDOM 1 2 147


APPROXIMATE F STATISTIC 1180.161182
DEGREES OF FREEDOM: 2 147


F-MATRIX FOR TESTING EQUALITY OF EACH PAIR OF POPULATION MEANS
DEGREES OF FREEDOM: 1 147

                POPULATION 2    POPULATION 3

POPULATION 1     1056.873873     2258.262161
POPULATION 2                      225.347513


STEP NUMBER 2 
VARIABLE ENTERED: 2 


VARIABLES ENTERED WITH F-TO-REMOVE STATISTICS (D.F. = 2 146)
          2           3
    43.0355   1112.9538


VARIABLES NOT ENTERED WITH F-TO-ENTER STATISTICS (D.F. = 2 145)
          1           4
    12.2685     34.5687


WILKS'-LAMBDA STATISTIC .036884
DEGREES OF FREEDOM 2 2 147


APPROXIMATE F STATISTIC 307.104665
DEGREES OF FREEDOM: 4 292


F-MATRIX FOR TESTING EQUALITY OF EACH PAIR OF POPULATION MEANS
DEGREES OF FREEDOM: 2 146

                POPULATION 2    POPULATION 3

POPULATION 1      804.510968     1473.231185
POPULATION 2                      116.038544


STEP NUMBER 3
VARIABLE ENTERED: 4


VARIABLES ENTERED WITH F-TO-REMOVE STATISTICS (D.F. = 2 145)
          2           3           4
    54.5769     38.7205     34.5687


VARIABLES NOT ENTERED WITH F-TO-ENTER STATISTICS (D.F. = 2 144)
          1
     4.7212


WILKS'-LAMBDA STATISTIC .024976
DEGREES OF FREEDOM 3 2 147


APPROXIMATE F STATISTIC 257.503170
DEGREES OF FREEDOM: 6 290


F-MATRIX FOR TESTING EQUALITY OF EACH PAIR OF POPULATION MEANS
DEGREES OF FREEDOM: 3 145

                POPULATION 2    POPULATION 3

POPULATION 1      692.014548     1381.162882
POPULATION 2                      133.373425


STEP NUMBER 4 
VARIABLE ENTERED: 1 


VARIABLES ENTERED WITH F-TO-REMOVE STATISTICS (D.F. = 2 144)
          1           2           3           4
     4.7212     21.9359     35.5902     24.9043


WILKS'-LAMBDA STATISTIC .023439
DEGREES OF FREEDOM 4 2 147


APPROXIMATE F STATISTIC 199.145344
DEGREES OF FREEDOM: 8 288


F-MATRIX FOR TESTING EQUALITY OF EACH PAIR OF POPULATION MEANS
DEGREES OF FREEDOM: 4 144

                POPULATION 2    POPULATION 3

POPULATION 1      550.188891     1098.273750
POPULATION 2                      105.312652


STEPPING STOPS BECAUSE NO TOLERANCE IS ABOVE THE TOLERANCE THRESHOLD 


COEFFICIENTS OF CANONICAL VARIABLES
               VARIABLE 1   VARIABLE 2   VARIABLE 3   VARIABLE 4

VARIABLE 1         .06801       .00199       .26152      ¯.02627
VARIABLE 2         .12656       .17853       .18632      ¯.11927
VARIABLE 3        ¯.18155      ¯.07686       .20056      ¯.14667
VARIABLE 4        ¯.23180       .23817      ¯.18090       .32994


DISPERSION EXPLAINED BY EACH CANONICAL VARIABLE
          1           2           3           4
    32.1919       .2854       .0000       .0000


CUMULATIVE PERCENTAGE DISPERSION EXPLAINED BY EACH CANONICAL VARIABLE
          1           2           3           4
      .9912      1.0000      1.0000      1.0000


CANONICAL CORRELATIONS
          1           2           3           4
      .9848       .4712       .0000       .0000


CANONICAL VARIABLES EVALUATED AT POPULATION MEANS
               VARIABLE 1   VARIABLE 2   VARIABLE 3   VARIABLE 4

POPULATION 1       .55384       .56717      ¯.42174      ¯.67365
POPULATION 2      ¯.32015       .48939      ¯.42174      ¯.67365
POPULATION 3      ¯.65056       .59172      ¯.42174      ¯.67365


POPULATION 2 OBSERVATION 21 HAS BEEN MISCLASSIFIED INTO POPULATION 3 
POPULATION 2 OBSERVATION 34 HAS BEEN MISCLASSIFIED INTO POPULATION 3 
POPULATION 3 OBSERVATION 34 HAS BEEN MISCLASSIFIED INTO POPULATION 2


CLASSIFICATION MATRIX
               POPULATION 1   POPULATION 2   POPULATION 3

POPULATION 1             50              0              0
POPULATION 2              0             48              2
POPULATION 3              0              1             49


NORMALIZED CLASSIFICATION MATRIX
               POPULATION 1   POPULATION 2   POPULATION 3

POPULATION 1       1.000000       0.000000       0.000000
POPULATION 2       0.000000       0.960000       0.040000
POPULATION 3       0.000000       0.020000       0.980000


PLOT OF THE FIRST TWO CANONICAL VARIABLES

[Scatter plot of the first two canonical variables, evaluated at the original units; each unit is plotted as its population number (1, 2 or 3).]

(ө) IS THE SYMBOL FOR THE FIRST TWO CANONICAL VARIABLES EVALUATED AT THE POPULATION MEANS.
(0) IS THE SYMBOL FOR ANY SITUATION IN WHICH TWO OR MORE UNITS FROM DIFFERENT POPULATIONS INTERSECT.


REMEMBER TO SAVE THIS WORKSPACE IF YOU DESIRE
IN THE FUTURE TO CLASSIFY MULTIVARIATE UNITS OR OBSERVATIONS
OF UNKNOWN ORIGIN INTO ONE OF THE KNOWN POPULATIONS.


Notes and Hints


1) Two of the state options allowed with the function DISCRIM in this workspace are not allowed
with STEPWISE; they are POOLED vs. UNPOOLED and EVALSCORES vs. EVALPROBS. Only
the default values POOLED and EVALSCORES are relevant to and recognized by STEPWISE.


2) The CANONICAL procedure ignores prior probabilities and automatically uses the state settings
NOCALCPOST and POOLED.


3) Under the MINERROR procedure, if prior probabilities are specified or calculated, they affect only
the constant terms of the discriminant functions and the posterior probabilities of the original
units, provided that the latter are requested via CALCPOST.


4) If it is decided during the data input phase to prevent STEPWISE from executing further, type
QUIT and return the carriage.


5) Missing values are allowed in each data matrix. For each missing value, enter ¯999.99. The
missing value for a variable from a unit in a population will be replaced by the mean of the
variable over all other existing units in the population. The resulting loss in degrees of freedom
will be reflected in the estimate of the covariance matrix using that population's data.


6) A discussion of the underlying theory behind the selection of the optimal set of variables in a
stepwise manner may be found in Jennrich (see References).


7) The function STEPWISE creates a temporary file which stores the data
matrices for all the populations. This file is automatically erased after any successful execution
of STEPWISE; should execution of the function not complete successfully, this file will remain in
existence. Any subsequent successful execution of STEPWISE will erase the file; otherwise, it must
be erased directly by the user.


8) If the MINERROR approach is used, DCON and DVAR are generated as global variables. The first
row of numbers under the heading "Coefficients of the Discriminant Functions" in the output
is the vector stored in DCON; the remaining rows under this heading form the transpose of the
matrix DVAR.


If instead, the CANONICAL approach is followed, two different global variables are produced:
CANCON and EVECTORS. The matrix in the output under the heading "Coefficients of Canonical
Variables" is stored in EVECTORS, while CANCON contains the coefficients of the canonical
variables evaluated at the population means. (However, although all canonical variables are included,
only those corresponding to a non-zero dispersion are meaningful for classification purposes.)


In either case, these global variables are used to classify unknown units into one of the
populations. Such classification of unknown units can also be determined through this workspace by
executing the function EVALUATE subsequent to an execution of the function STEPWISE. A request
will be made to enter the unknown units; they should be entered as a matrix with one unit per
row. The classification scores, calculated as in Lachenbruch, Chapter 5, will be printed for each
unknown unit, as will the maximum score and the population corresponding to this maximum
score.


35 PRINCIPAL


FUNCTIONS

Function Header     Documentation      Description
CANONICALCOR K      CANONICALCORHOW    Performs canonical correlation analysis.
PRINCIPAL K         PRINCIPALHOW       Performs principal component analysis.

CANONICALCOR 
Syntax 


CANONICALCOR K 
Description 


This function performs a canonical correlation analysis between two sets of variables. An input matrix 
of multivariate observations is required for each variable set. CANONICALCOR produces the canonical 
correlations along with the coefficients of each corresponding pair of canonical variates. The program 
also contains options for estimating the correlations between the canonical variates and the original 
variables, for testing hypotheses relating to the canonical correlations, and for evaluating the canonical 
variates at the initial data observations. 


Argument 


K - a 2-element vector containing the number of variables in the two variable sets. Since these 
two numbers need not be equal, the smaller of the two should be provided first. 


State Setting Functions 


There are five optional features available with this program; these are selected by changing the values 
of the state. The state is determined by the following state setting functions (with default values in 
square brackets, where applicable): 


STATE 
Displays the current state. 


DEFAULT 
Returns the entire state to its default values.


COVARIANCE or CORRELATION [CORRELATION] 
Specifies whether the canonical correlation analysis is to be based on covariance matrices or 
correlation matrices, estimated both within the two variable sets and between these two sets. This 
option does not affect the actual canonical correlations but does influence the coefficients of the 
canonical variates. 


PRINTMATRIX or NOPRINTMATRIX [NOPRINTMATRIX]
Specifies whether or not the covariance or correlation matrices (requested above) are to be 
explicitly printed before the results of the canonical correlation analysis are produced. 


COMPVARCORRNS or NOCOMPVARCORRNS [NOCOMPVARCORRNS]
Specifies whether or not correlations are to be estimated (and printed) between the canonical 
variates and the original variables. The correlations will be calculated between each canonical 
variate and each variable upon which that variate is based, separately. No correlations are 
estimated between the original variables of one set and the canonical variates based on variables 
of the other set. 


TESTHYPS or NOTESTHYPS [NOTESTHYPS]
Specifies whether or not hypothesis tests concerning the actual canonical correlations are to be
performed. The number of tests carried out equals K[1] (i.e., the first element in the argument
to CANONICALCOR). The i'th hypothesis states that the i'th canonical correlation is equal to all
subsequent canonical correlations, and that these correlations (starting with the i'th one) are all 
zero. Each of these hypotheses is tested by an approximate chi-square statistic, as described in 
Timm, pages 350-351 (see Reference). 
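
The exact statistic is defined in Timm. As a hedged illustration only, the Python sketch below assumes Bartlett's chi-square approximation, which reproduces the test values and degrees of freedom printed in the example for this function (n = 37 observations, K = 2 3). The function name is hypothetical.

```python
import math


def bartlett_chi_square(correlations, n, p, q, start):
    """Approximate chi-square test that correlations start..end are all zero.

    correlations -- canonical correlations, largest first
    n            -- number of observations
    p, q         -- number of variables in the two sets (p <= q)
    start        -- 1-based index i of the first correlation under test
    Returns (test statistic, degrees of freedom).
    """
    factor = n - 1 - (p + q + 1) / 2
    tail = correlations[start - 1:]
    statistic = -factor * sum(math.log(1 - r * r) for r in tail)
    df = (p - start + 1) * (q - start + 1)
    return statistic, df


canonical = [0.688892, 0.193591]
chi1, df1 = bartlett_chi_square(canonical, n=37, p=2, q=3, start=1)
chi2, df2 = bartlett_chi_square(canonical, n=37, p=2, q=3, start=2)
```

The significance level would then be the upper-tail probability of a chi-square distribution with the returned degrees of freedom; that step is omitted here to keep the sketch free of external libraries.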


CANCORSCORES or NOCANCORSCORES [NOCANCORSCORES]
Specifies whether or not the canonical variates are to be evaluated at the original data observa- 
tions, the resulting scores being printed. 


Input 


After the state has been set and CANONICALCOR has been invoked, a request is made for the first data
matrix, which should have n (the number of observations) rows and K[1] columns. Each row
represents a sample multivariate unit or observation while each column contains the values of one
variable from the first variable set over all of the units. Once this data matrix (which is based on
the smaller variable set) has been properly entered, a request is made for the second data matrix. 
This second matrix has K[2] (the second element in the argument to CANONICALCOR) columns but 
must also have n rows; i.e., both data matrices must contain the same number of rows, with these
rows referring to the same multivariate units. 


Output 


The estimates of the initial covariance or correlation matrices within and between the two variable 
sets are printed if so requested via PRINTMATRIX. (For the matrix between variable sets, the rows 
represent the first set and the columns the second set.) Next, there are K[1] blocks of output, each
containing a canonical correlation, the coefficients of the corresponding pair of canonical variates, and 
the proportion of variation in the original variable sets which is common to the pair of canonical 
variates. The last of these output blocks is followed by an estimate of the mean square canonical 
correlation coefficient, defined as the average of the squares of the K[1] different canonical correlations.
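
The definition of the mean square canonical correlation coefficient translates directly into a few lines of Python. This is a sketch with a hypothetical function name; the two correlations are those printed in the example for this function.

```python
def mean_square_canonical_correlation(correlations):
    """Average of the squared canonical correlations."""
    return sum(r * r for r in correlations) / len(correlations)


# Canonical correlations from the example (K[1] = 2).
msc = mean_square_canonical_correlation([0.688892, 0.193591])
```

The result agrees with the printed value 0.256025 to the precision shown.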


The remaining output depends on the values of the state. The output associated with 
COMPVARCORRNS, TESTHYPS and CANCORSCORES, respectively, is presented if any or all of these 
options are in effect. Note that with TESTHYPS, the chi-square test value, degrees of freedom and 
significance level are given for each hypothesis tested. Also, with CANCORSCORES, the canonical variates 
are evaluated at the input data after it has been first either centred (if COVARIANCE is selected) or 
standardized (if CORRELATION is in effect). However, the required centering or standardization is 
performed by the program; it need not be carried out by the user prior to input. 
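
The difference between centring and standardizing a column of data can be sketched as follows. The sketch assumes that the standard deviation uses the n - 1 divisor, which the text does not state; the function names are hypothetical.

```python
import math


def centre(column):
    """Subtract the column mean from every value."""
    mean = sum(column) / len(column)
    return [x - mean for x in column]


def standardize(column):
    """Centre the column, then divide by its standard deviation.

    Assumption: the sample standard deviation (divisor n - 1) is used.
    """
    centred = centre(column)
    sd = math.sqrt(sum(d * d for d in centred) / (len(column) - 1))
    return [d / sd for d in centred]


centred_values = centre([1.0, 2.0, 3.0])
standardized_values = standardize([2.0, 4.0, 6.0])
```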
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Example 
Taken from Timm, pages 314 and 356 (see Reference). 


%215,5Х1,315" OFMT(CCPSYCH1;CCPSYCH2) 


12 16 49 48 8 
30 27 47 76 13 
16 16 11 40 13 
17 8 9 52 9 
26 17 69 63 15 
8% 25 35 82 14 
23 18 5 71 21 
19 14 8 68 8 
16 13 49 74 11 
26 25 8 70 15 
35 24 47 70 15 
15 14 6 61 11 
27 21 14 54 12 
20 17 30 55 13 
26 22 4 54 10 
14 8 24 40 14 
35 27 19 66 13 
14 16 45 54 10 
27 26 22 64 14 
18 10 16 47 16 
14 18 32 48 16 
26 26 37 52 14 
23 23 47 74 19 
11 8 5 57 12 
15 17 6 57 10 
28 21 60 80 11 
34 23 58 78 13 
23 11 6 70 16 
12 8 16 47 14 
32 32 45 94 19 
25 14 9 63 11 
29 21 59 76 16 
23 24 35 59 11 
19 12 19 55 8 
18 18 58 74 14 
31 26 58 71 17 
15 14 79 54 14 
STATE 
CORRELATION 
NOPRINTMATRIX 
NOCOMPVARCORRNS 
PRINCOMSCORES 
NOTESTHYPS 
NOCANCORSCORES 

TESTHYPS 
WAS NOTESTHYPS 

CANONICALCOR 2 3 
ENTER DATA MATRIX 1 
⎕: 
CCPSYCH1 
ENTER DATA MATRIX 2 
⎕: 
CCPSYCH2 


CANONICAL CORRELATIONS AND COEFFICIENTS OF CANONICAL VARIATES 

CANONICAL CORRELATION 1: 0.688892 
PROPORTION OF VARIATION COMMON TO THIS PAIR OF CANONICAL VARIATES: 0.474573 
COEFFICIENTS OF CANONICAL VARIATE FOR FIRST SET OF ORIGINAL VARIABLES 
       1        2 
  0.7752   0.2662 
COEFFICIENTS OF CANONICAL VARIATE FOR SECOND SET OF ORIGINAL VARIABLES 
       1        2        3 
  0.0520   0.8991   0.1831 

CANONICAL CORRELATION 2: 0.193591 
PROPORTION OF VARIATION COMMON TO THIS PAIR OF CANONICAL VARIATES: 0.037477 
COEFFICIENTS OF CANONICAL VARIATE FOR FIRST SET OF ORIGINAL VARIABLES 
       1        2 
 ¯1.5550   1.6274 
COEFFICIENTS OF CANONICAL VARIATE FOR SECOND SET OF ORIGINAL VARIABLES 
       1        2        3 
  0.9939  ¯0.5912   0.3125 


THE MEAN SQUARE CANONICAL CORRELATION COEFFICIENT: 0.256025 


CHI-SQUARE TEST STATISTICS FOR EQUALITY OF CORRELATION COEFFICIENTS TO ZERO 

STARTING FROM CORRELATION COEFFICIENT 1 
CHI-SQUARE TEST STATISTIC: 22.497 
DEGREES OF FREEDOM: 6 
SIGNIFICANCE LEVEL: 0.0009835 

STARTING FROM CORRELATION COEFFICIENT 2 
CHI-SQUARE TEST STATISTIC: 1.261 
DEGREES OF FREEDOM: 2 
SIGNIFICANCE LEVEL: 0.5324530 


REMEMBER TO SAVE THIS WORKSPACE IF YOU WISH TO LATER 
USE THE COEFFICIENTS OF THE CANONICAL VARIATES 
Notes and Hints 


1) If it is decided during the data input phase to prevent the program from executing further, type 
QUIT and return the carriage. 


2) Missing values are allowed in the initial data matrices. For each missing value, enter 999.99. 
The missing value for a variable from a unit will be replaced by the mean of that variable over 
all existing units. The resulting loss in degrees of freedom will be reflected in the estimates of 
the initial covariance or correlation matrices upon which the canonical correlation analysis is 
based. 


3) The state setting PRINCOMSCORES/PRINCOMRANKS is not relevant to the canonical correlation 
analysis; it applies only to the principal component analysis which may also be performed in this 
workspace. 


4) Two global variables are generated by CANONICALCOR; they are called CCFA and CCFB. CCFA 
is a matrix containing the coefficients of the canonical variates based on the variable set with 
K[1] variables. Each column of CCFA corresponds to one canonical variate. Similarly, CCFB is 
a matrix containing the coefficients of the canonical variates based on the second variable set. 


5) A discussion of the theory fundamental to a canonical correlation analysis is provided in Timm, 
Sections 4.14 and 4.15 (see Reference). 
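
The missing-value rule described in the notes above (each entry coded 999.99 is replaced by the mean of that variable over the units where it is present) can be sketched as follows. This is a minimal Python illustration, not the workspace's own code; the function name is hypothetical, and the adjustment of degrees of freedom mentioned in the notes is not reproduced here.

```python
MISSING = 999.99  # missing-value code quoted in the notes


def fill_missing_with_column_means(rows, missing=MISSING):
    """Replace each missing entry by the mean of its column's present values."""
    filled = [list(row) for row in rows]
    for j in range(len(rows[0])):
        present = [row[j] for row in rows if row[j] != missing]
        if not present:
            raise ValueError("column %d has no observed values" % j)
        mean = sum(present) / len(present)
        for row in filled:
            if row[j] == missing:
                row[j] = mean
    return filled


# Three units measured on two variables, with one missing value in each column.
data = [[1.0, 10.0], [999.99, 20.0], [3.0, 999.99]]
completed = fill_missing_with_column_means(data)
print(completed)
```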


Reference 


Timm, N.H. Multivariate Analysis with Applications in Education and Psychology, Brooks-Cole 
Publishing Co., Monterey, California, 1975. 


Source 


A. MacLeod 
I.P. Sharp Associates Limited 
Ottawa, Ontario 
PRINCIPAL 
Syntax 
PRINCIPAL K 
Description 
This program performs a principal component analysis on a set of input data and determines either 
the actual value or the relative ranking by value of each principal component evaluated at each data 
observation. 
Argument 
K  - The number of variables measured on each multivariate unit. 
State Setting Functions 
There are four optional features available with this program; these are selected by changing the values 
of the state. The state is determined by the following state setting functions (with the defaults given 
in square brackets, where applicable): 


STATE 
Displays the current state. 


DEFAULT 
Returns the entire state to its default values. 


COVARIANCE or CORRELATION [CORRELATION] 
Specifies whether the principal component analysis is to be based on the covariance matrix or 
the correlation matrix, either of which is estimated from the given data. Note that the results 
of the analysis based on the covariance matrix generally bear no mathematical relationship 
whatsoever to those based on the correlation matrix. 


PRINTMATRIX or NOPRINTMATRIX [NOPRINTMATRIX] 
Specifies whether or not the covariance or correlation matrix of the original data is to be printed 
before the results of the principal component analysis are produced. 

PRINCOMSCORES or PRINCOMRANKS [PRINCOMSCORES] 
Specifies whether the actual values (scores) of each principal component evaluated at the original 
data observations are to be printed or whether these scores are to be ranked in descending order, 
separately for each component, the ranks being printed instead. 


COMPVARCORRNS or NOCOMPVARCORRNS [NOCOMPVARCORRNS] 
Specifies whether or not the following two tables are to be calculated and output: 


(a) The correlations between the principal components and the original variables; 


(b) The proportions of the variance of each variable accounted for by each principal component. 


Input 


After the state is set and PRINCIPAL has been invoked, a request is made for an N (number of 
observations) by K data matrix. The rows of this matrix represent the sample multivariate units or 
observations, while each column contains the values of one variable over all of the units (with one 
column per variable). Although the data is centred or standardized for the analysis, these adjustments 
are performed by the program so that the data may be entered here in its raw, original form. 
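
The two adjustments mentioned here can be illustrated with a short Python sketch: centring subtracts each column's mean, and standardizing additionally divides by the column's standard deviation. The function names are hypothetical, and the use of the divisor N-1 for the standard deviation is an assumption; the manual does not say which divisor the program uses.

```python
import math


def column_means(rows):
    """Mean of each column of a list-of-rows matrix."""
    n = len(rows)
    return [sum(row[j] for row in rows) / n for j in range(len(rows[0]))]


def centre(rows):
    """Subtract the column mean from every entry (COVARIANCE case)."""
    means = column_means(rows)
    return [[x - m for x, m in zip(row, means)] for row in rows]


def standardize(rows):
    """Centre, then divide by the column standard deviation (CORRELATION case).

    The divisor N-1 is an assumption made for this sketch.
    """
    centred = centre(rows)
    n = len(rows)
    sds = [math.sqrt(sum(row[j] ** 2 for row in centred) / (n - 1))
           for j in range(len(rows[0]))]
    return [[x / s for x, s in zip(row, sds)] for row in centred]


raw = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
centred_data = centre(raw)
standardized_data = standardize(raw)
print(centred_data)
print(standardized_data)
```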


Output 


The estimate of the initial covariance or correlation matrix is printed first if it has been requested 
via PRINTMATRIX. The next result produced is a matrix whose four columns contain: 


(a) The component number; 

(b) The eigenvalue associated with that component; 

(c) The percentage contribution of the component to total variability (in the original variables); 
(d) The cumulative percentage contribution of the component to total variability. 
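
Columns (c) and (d) follow directly from column (b): each eigenvalue is divided by the sum of all eigenvalues, and the percentages are then accumulated. The Python sketch below illustrates this with the eigenvalues of the example in this section. The function name is hypothetical. The first eigenvalue is taken to be 223.8414253, which is an assumption: its leading digits are not legible in the printed example and are restored here so that the value is consistent with the printed percentage 96.1557606.

```python
def contribution_table(eigenvalues):
    """Return rows of (component, eigenvalue, percent, cumulative percent)."""
    total = sum(eigenvalues)
    rows = []
    cumulative = 0.0
    for number, value in enumerate(eigenvalues, start=1):
        percent = 100.0 * value / total
        cumulative += percent
        rows.append((number, value, percent, cumulative))
    return rows


# Eigenvalues from the example; the first one is a reconstruction (see above).
example_eigenvalues = [223.8414253, 8.2181095, 0.4724624, 0.2584501, 0.0]
table = contribution_table(example_eigenvalues)
for row in table:
    print("%d %12.7f %12.7f %12.7f" % row)
```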


This matrix is followed by the K by K matrix of weights (eigenvectors) for the components (each 
column corresponding to a component), and an N by K matrix containing either the principal 
component scores or ranks. These scores or ranks are based on the original data which is first centred 
(if COVARIANCE is selected) or standardized (if CORRELATION is chosen). Often, some of the latter 
columns of these component scores consist entirely of zeros. These all-zero columns are not listed in 
the output, but the number of such columns that exist is stated. 
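
When PRINCOMRANKS is in effect, the scores of each component are replaced by their ranks in descending order, so that the largest score receives rank 1. A minimal Python sketch of this ranking, applied to one column of scores, is given below. The function name is hypothetical, and the treatment of tied scores (here, the earlier observation gets the smaller rank) is an assumption, since the manual does not describe it.

```python
def descending_ranks(scores):
    """Rank scores so that the largest value gets rank 1.

    Ties are broken by order of appearance (an assumption of this sketch).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


component_scores = [15.2, 20.4, 1.0, -23.5]
ranks = descending_ranks(component_scores)
print(ranks)
```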


Finally, the two tables described above under the state setting function COMPVARCORRNS are printed 
(if they have been requested). 
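
In the example of this section, the entries of the second table agree, to four decimals, with the squares of the corresponding entries of the first table. The Python sketch below checks this for the first two variables. It only illustrates that relationship in the printed output; whether the program computes the second table in exactly this way is not stated in the manual, and the function name is hypothetical. Magnitudes are used, since the sign of a correlation does not affect its square.

```python
def variance_proportions(correlations):
    """Square each correlation and round to four decimals."""
    return [round(r * r, 4) for r in correlations]


# Magnitudes of the correlations printed for VARIABLE 1 and VARIABLE 2.
variable_1 = [0.9985, 0.0544, 0.0016, 0.0002, 0.0000]
variable_2 = [0.9836, 0.1800, 0.0066, 0.0006, 0.0000]

proportions_1 = variance_proportions(variable_1)
proportions_2 = variance_proportions(variable_2)
print(proportions_1)
print(proportions_2)
```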
Example 


Taken from Kendall, page 20 (see References). 


PCSOIL 
77.3 1 
82.5 1 
66.9 2 
87.2 3 
65.3 2 
83.3 1 
81.6 1 
87.8 3 
48.6 3 
61.6 2 
58.6 2 
69.3 2 
$1.8 3 
57.7 2 
57.2 3 
57.2 2 
59.2 3 
80.2 1 
82.2 1 
89.7 2 
STATE 
CORRELATION 
NOPRINTMATRIX 
NOCOMPVARCORRNS 
PRINCOMSCORES 
NOTESTHYPS 
NOCANCORSCORES 

COVARIANCE 
WAS CORRELATION 
PRINTMATRIX 
WAS NOPRINTMATRIX 
COMPVARCORRNS 
WAS NOCOMPVARCORRNS 

PRINCIPAL 5 
ENTER THE DATA MATRIX 
⎕: 
PCSOIL 


THE SAMPLE COVARIANCE MATRIX 

             VARIABLE 1  VARIABLE 2  VARIABLE 3  VARIABLE 4  VARIABLE 5 
VARIABLE 1     138.3267   ¯102.1227    ¯36.2040     ¯0.9422     ¯0.1358 
VARIABLE 2    ¯102.1227     79.7382     22.3846      1.5266      0.1108 
VARIABLE 3     ¯36.2040     22.3846     13.8194     ¯0.5844      0.0250 
VARIABLE 4      ¯0.9422      1.5266     ¯0.5844      0.6434      0.0324 
VARIABLE 5      ¯0.1358      0.1108      0.0250      0.0324      0.2626 


                               PERCENTAGE 
                               CONTRIBUTION OF 
                               COMPONENT TO         CUMULATIVE 
COMPONENT     EIGENVALUE       TOTAL VARIABILITY    PERCENTAGE 
    1        223.8414253        96.1557606           96.1557606 
    2          8.2181095         3.5302605           99.6860213 
    3          0.4724624         0.2029561           99.8889774 
    4          0.2584501         0.1110226          100.0000000 
    5          0.0000000         0.0000000          100.0000000 


THE PRINCIPAL COMPONENT WEIGHTS (EIGENVECTORS) 

             COMP. 1   COMP. 2   COMP. 3   COMP. 4   COMP. 5 
VARIABLE 1    0.7849   ¯0.2231    0.0272    0.0039    0.5774 
VARIABLE 2   ¯0.5871   ¯0.5608    0.0861    0.0102    0.5774 
VARIABLE 3   ¯0.1979    0.7839   ¯0.1133   ¯0.0141    0.5774 
VARIABLE 4   ¯0.0068   ¯0.1458   ¯0.9800   ¯0.1357    0.0000 
VARIABLE 5   ¯0.0008   ¯0.0021   ¯0.1368    0.9906    0.0000 


EVALUATION OF P.C. SCORES USING THE ORIGINAL DATA 

             COMP. 1   COMP. 2   COMP. 3 
OBSVN  1     15.1551    2.6282    0.5882 
OBSVN  2     20.4333    1.4257    0.7069 
OBSVN  3      0.9699    2.7635   ¯0.2235 
OBSVN  4    ¯23.5316    4.7810   ¯0.6848 
OBSVN  5     ¯0.5608    4.5676   ¯0.0625 
OBSVN  6     21.2143    0.5170    0.0648 
OBSVN  7     18.4882   ¯1.5032   ¯0.2806 
OBSVN  8    ¯23.9905    0.6161    0.2360 
OBSVN  9    ¯23.4364   ¯0.9672    0.6640 
OBSVN 10     ¯6.1437    1.5692    0.3598 
OBSVN 11     ¯9.4842    3.1739   ¯0.2701 
OBSVN 12      2.6554   ¯2.1871   ¯1.2135 
OBSVN 13     18.0147   ¯5.8737    0.7836 
OBSVN 14     ¯0.0904   ¯4.7272   ¯1.6652 
OBSVN 15    ¯12.6893   ¯1.7359    0.4975 
OBSVN 16      0.4412   ¯0.5066   ¯0.6333 
OBSVN 17    ¯10.7233   ¯3.7488    0.8468 
OBSVN 18     16.9245   ¯0.6326    0.6275 
OBSVN 19     19.7050    0.1451    0.1023 
OBSVN 20      3.6783   ¯0.3048   ¯0.8438 


THE REMAINING 1 COLUMNS OF P.C. SCORES CONSIST ENTIRELY OF ZEROES 


THE MATRIX OF CORRELATIONS BETWEEN THE VARIABLES AND THE P.C.'S 

             COMP. 1   COMP. 2   COMP. 3   COMP. 4   COMP. 5 
VARIABLE 1    0.9985    0.0544    0.0016    0.0002    0.0000 
VARIABLE 2    0.9836   ¯0.1800    0.0066    0.0006    0.0000 
VARIABLE 3   ¯0.7963    0.6045   ¯0.0208   ¯0.0019    0.0000 
VARIABLE 4   ¯0.1270   ¯0.5209   ¯0.8397   ¯0.0860    0.0000 
VARIABLE 5    0.0231   ¯0.0118   ¯0.1835    0.9827    0.0000 


THE PROPORTIONS OF THE VARIANCE OF EACH VARIABLE 
ACCOUNTED FOR BY EACH PRINCIPAL COMPONENT 

             COMP. 1   COMP. 2   COMP. 3   COMP. 4   COMP. 5 
VARIABLE 1    0.9970    0.0030    0.0000    0.0000    0.0000 
VARIABLE 2    0.9675    0.0324    0.0000    0.0000    0.0000 
VARIABLE 3    0.6341    0.3654    0.0004    0.0000    0.0000 
VARIABLE 4    0.0161    0.2713    0.7051    0.0074    0.0000 
VARIABLE 5    0.0005    0.0001    0.0337    0.9657    0.0000 


BE SURE TO SAVE THIS WORKSPACE IF YOU WISH TO LATER 
USE THE PRINCIPAL COMPONENT WEIGHTS AND/OR SCORES. 
Notes and Hints 


1) If it is decided during the data input phase to prevent PRINCIPAL from executing further, type 
QUIT and return the carriage. 


2) Missing values are allowed in the initial data matrix. For each missing value, enter 1 
The missing value for a variable from a unit will be replaced by the mean of the variable over 
all existing units. The resulting loss in degrees of freedom will be reflected in the estimate of 
the initial covariance or correlation matrix upon which the principal component analysis is based. 


3) The state setting functions TESTHYPS or NOTESTHYPS and CANCORSCORES or 
NOCANCORSCORES are not applicable to the principal component analysis. They are only relevant 
to the function CANONICALCOR in this workspace (canonical correlation analysis). 


4) There is one global variable which always results from PRINCIPAL; its name is IGHTZ and 
it contains the principal component weights as they are given in the output. If 
PRINCOMSCORES is a chosen option, a second global variable is produced; it is called 
CSC and contains the N by K matrix of principal component scores, as produced in the 
output. Note that any all-zero columns of scores, although not explicitly printed, are still included 
in the matrix which is stored in this second variable. 


5) The methods employed to determine the two tables obtained through COMPVARCORRNS are 
outlined in Johnston, pages 325-326, whereas a more general development of the theory behind 
principal components may be found in Morrison, Chapter 7. 
ERROR MESSAGES 
AND CORRECTIVE ACTION 


CHAR ERROR 


A CHAR ERROR is caused by an illegally overstruck character or an error in transmission. The readable 
portion of the line up to the character error is displayed, and you must retype the rest of that line. 


DOMAIN ERROR 


A DOMAIN ERROR will be encountered if the arguments of a function are not in the domain of that 
function. Examples are: division by zero, ÷0; an attempt to add numeric data to character data, 2+'2'; 
taking the log of zero, ⍟0; or attempting to invert a singular matrix. In general, the corrective measure 
is to reformulate the instruction or correct the data. 


INDEX ERROR 


An INDEX ERROR results if an index is used which is outside the range of the array (i.e., the element 
or elements being referred to are non-existent). Reformulate the instruction to constrain the range 
of the indices to that defined by ⍴A, where A is the variable in question. 


LENGTH ERROR 


A LENGTH ERROR will result if combining co-ordinates are not conformable in size. The remedy is 
to reformulate the expression. 


OPEN QUOTE 


An OPEN QUOTE message is printed whenever an odd number of quote (', upper case K) marks appear 
in a line of APL (exclusive of comments). Quote is used as a string delimiter for all character 
variables; the beginning and end of any text string must be signalled by this character. 


When unmatched quotes are discovered, the error message is printed, as is the line of text you were 
attempting to define. The carriage is placed at the right-hand end of the input line, allowing you to 
type the required closing quote or make other changes. Note that ' can be included as a character 
within a defined text string by typing '', so that the number of quotes contained in the entire string 
remains even. 


RANK ERROR 
A RANK ERROR will be encountered if the function is not defined for arrays of this rank, or if the 
ranks are not conformable for the function. Reformulate the expression using arrays of the proper 
rank which are conformable according to the function invoked. 


RESEND 


As a result of certain transmission errors, the message RESEND is sent back, the carriage returns, and 
the keyboard unlocks. The line must then be retyped completely. 
SI DAMAGE 


The state indicator contains a list of all suspended and pendant functions, with the most recent at 
the top. The list may be displayed by the system command )SI. Damage to the state indicator, called 
SI DAMAGE, occurs if an attempt is made to change the header of a suspended function, a line label, 
or any reference to a line label. In general, no recovery procedure is possible after this error. It is 
your responsibility to clear the state indicator, either partially, by typing → for each level to be cleared, 
or completely, by typing )RESET. 


SYNTAX ERROR 


In general, a SYNTAX ERROR occurs when an expression has been improperly formulated in APL or 
a function improperly called. Common causes are: two variables juxtaposed with no APL function 
between them, an unmatched set of parentheses, or a single-argument function used with two 
arguments. 


VALUE ERROR 


A VALUE ERROR will result if an attempt is made in an expression to use a variable whose value 
has not been previously specified. A common cause is misspelling of the variable name. Also a 
VALUE ERROR can occur when attempting to use in an expression the name of a function which does 
not return an explicit result. One possible corrective measure in the latter case is to redefine the 
function header to return an explicit result (e.g., ∇F←X FCN Y). 


WS FULL 


A WS FULL report is given when the active workspace becomes overloaded. This commonly occurs 
when attempting to specify a variable which is too large to be stored in the active workspace, or when 
attempting to copy from a saved workspace objects for which there is not enough storage space left 
in the active workspace. Also, it may occur during the execution of a function or expression if the 
result of an intermediate calculation becomes too large to fit in the active workspace, even though the 
end result alone might not be large. 


A good corrective measure is to display the functions and variables list and erase those objects no longer 
needed. Also, the state indicator should be displayed by the command )SI, to see which functions are 
suspended, and undesired suspensions should be cleared by typing a right arrow for each occurrence 
of an asterisk (*), or typing )RESET to completely clear the stack. As a last resort, it may be possible 
to revise the APL coding in the expression so that it will do the calculation in less space. 


The system function ⎕WA will return as a result the number of bytes unused in a workspace. This 
information may help to determine whether there is enough space for a variable to be stored. To 
determine this, count 8 bytes for each number in a numeric variable (of any rank) which contains 
at least one floating point number (i.e., any number with decimal digits), 4 bytes per number in a 
variable containing only integers, and 1 bit (8 bits = 1 byte) per number in a variable containing only 
logical digits (i.e., 0 or 1). In a character variable, each character, including blanks, takes up 1 byte. 
This measure of space requirements is only approximate because of certain additional overheads, but 
suffices for most purposes. 
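
The counting rule above can be written out as a small calculation. The Python sketch below is only an illustration of the arithmetic (8 bytes per floating point number, 4 bytes per integer, 1 bit per logical digit, 1 byte per character); the function name and the category labels are hypothetical, and, as the text notes, the additional overheads of the APL system are ignored.

```python
import math

# Storage per element, in bytes, as stated in the text above.
BYTES_PER_ELEMENT = {
    "float": 8.0,       # variable containing at least one floating point number
    "integer": 4.0,     # variable containing only integers
    "logical": 1 / 8,   # variable containing only 0s and 1s (1 bit each)
    "character": 1.0,   # character variable, blanks included
}


def estimated_bytes(element_count, kind):
    """Approximate storage needed for a variable, ignoring overheads."""
    return math.ceil(element_count * BYTES_PER_ELEMENT[kind])


# A 20 by 5 matrix of floating point numbers needs about 800 bytes.
needed = estimated_bytes(20 * 5, "float")
print(needed)
```

The estimate can then be compared with the number of unused bytes reported for the workspace to judge whether the variable is likely to fit.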


Fi 


A variation of WS FULL, called WS F 7 OR, occurs when a line of APL code cannot be 
accommodated in the workspace. Possible remedies include those mentioned above (but revising code might 
work only if the line becomes shorter). 
GLOSSARY 


Active Workspace 


The active workspace is the area you use for carrying out calculations and executing functions. One 
of its important properties is its size, which determines how much it may contain at any one time. 


Argument 
An argument is an APL array (which may be a variable or the result of an expression) used with 
a function. Monadic functions require one argument to the right of the function name, and dyadic 
functions require two arguments, one on either side of the function name. 
For example, the primitive dyadic function * requires two arguments: 

2*7 
Similarly, the dyadic function FR, which calculates frequencies, requires two arguments: 


HOW FR DATA 


where the left argument in this case must specify how to classify the data and the right argument 
supplies the data. 


On the other hand, reciprocal ÷ is an example of a primitive monadic function requiring just one 
argument on the right: 

÷3 5 

(Note that the vector 3 5, although consisting of 2 elements, is a single argument because it is an 
array.) 


Similarly, the program MEAN is also monadic, and requires a right argument which is the data: 
MEAN DATA 

Array 

An array is a structured collection of elements. An N-dimensional array has N co-ordinates (or axes) 
and thus its elements are selected by N indices. For example, a one-dimensional array is a vector, a 
two-dimensional array is a matrix and a three-dimensional array consists of a series of planes, each 
plane being a matrix. This is analogous to a loaf of bread, where the slices are the planes, and 
each slice is a matrix. A scalar is a dimensionless array. 


For example, the array below is a matrix: 


with two rows and three columns. It is a two-dimensional array. 
Assignment 
The assignment arrow ← is used to give a name to an array for subsequent reference. 
For example, the expression 

A←7 11 16 


can be read informally as "A is 7 11 16" and the vector consisting of the three numbers 7 11 16 
can now be referred to as A. 


Dyadic 


The term dyadic refers to functions which have two arguments, and these must appear on either side 
of the function. 


For an example, read “Argument” above. 

Library 

An APL library is a collection of stored, inactive workspaces. 
Matrix 


A matrix is a two-dimensional array (or table) whose shape is given by the number of rows followed 
by the number of columns. Since it has two co-ordinates, its elements are selected by two indices. 


For example, M is a 3x4 matrix: 


M 
Б 22-25-07 
10 Ве 16: ob 
12-5: dr :В 
⍴M 
3 4 
Monadic 


The term monadic refers to functions which have one argument. This argument must be placed on 
the right-hand side of the function. 


For an example, read "Argument" above. 
Primitive Function 
Primitive functions are functions which are provided by the APL system and are immediately available 
on the keyboard for use. A primitive function is designated by a symbol or a combination of symbols 
which are neither alphabetic nor numeric. For example: + + е р + are all primitive functions. 
Scalar 
A scalar is a numeric or character array which is dimensionless. It cannot be indexed. 
For example, the value 
5 
is a scalar. Note that the shape of this scalar is "nothing": 


⍴5 


That is, nothing is printed but a blank line (indicating the null vector). Remember that since a scalar 
has no dimensions, it cannot be indexed. 


Shape 


The shape of an array is the number of elements along each co-ordinate. This indicates the size of 
the array, and the range of the indices which may be used to access elements. 


For instance, suppose X is a two-dimensional array, or matrix as follows: 


X 
7 8 12 
0 6 4 


The shape of X is 2 (rows) by 3 (columns), and this "shape" information can be determined by typing 
the following: 


⍴X 
2 3 


On the other hand, suppose we have a vector Y: 


Y 
17 10 23 18.2 14.7 29 
Then 

⍴Y 
6 


tells us the length of Y. 
Stored Workspace 


A copy of any active workspace can be saved for later use in a library and will then be called a stored 
workspace. 
Variable 


A variable is an array with a name. The name can be up to 77 characters long (the first character 
must be alphabetic). When a conversational program requests numeric data, a variable may be input 
to refer to data stored previously. 

Vector 


A vector is a one-dimensional array (a sequence of numbers or of characters). It has one co-ordinate 
and hence its elements are selected by using one index. 


For example, 


Y 
18 16 14 12 10 


shows Y is a vector of 5 numbers. 
INDEX 


Almon lags, PDLAG, 143 

Analysis of covariance, ANOCOVA, 301 

Analysis of variance, 272 
categorical data, CATANOVA, 78 
complete factorial, NWAYANOVAC, NWAYANOVAP, 287, 292 
completely randomized, NWAYANOVAC, NWAYANOVAP, 287, 292 
Duncan's multiple range test, DUNCAN, 119 
factorial, crossed, nested, or cross-nested, ANOVA1, 273 
Friedman 2-way, FRIEDMAN, 82 
Kruskal-Wallis 1-way, KRUSKAL, 87 
Latin square, LATINSQC, LATINSQP, 279, 281 
multi-way, NWAYANOVAC, NWAYANOVAP, 287, 292 
one-way, with unequal subclasses, ANOVA2, 277 
split-plot or split-split-plot, SPLITPLOTC, 297 

Autocorrelation, method of Hildreth-Lu, HILDRETHALU, 175 
method of Cochrane-Orcutt, COCHRANEANORCUTT, 168 


Bartlett’s test 
conversational, BARTLETT, 118 
non-conversational, BART, 117 
Beta distribution, random number generation, PBETA, 254 
Binomial, frequency analysis of binomial data, 
FREQ, 58 
Binomial distribution 
confidence limits for P, PCONFINT, 98 
cumulative probability, BINOM, 206 
negative binomial, random number generation, PNEGBIN, 263 
probability, BINOMIALPROB, 207 
random number generation, PBINOMIAL, 256 
test, BINOMTEST, 77 


Canonical Correlation analysis, CANONICALCOR, 325 
Categorical data, analysis of variance, CATANOVA, 78 
Cauchy distribution, random number generation, NCAUCHY, 246 
Chi-square distribution 
contingency tables, CHISQ, 15 
crosstabulation, CTAB, 17 
non-central, random number generation, PNONCENTCHISQ, 266 
probability, CHISQUAREPROB, 210 
random number generation, PCHISQUARE, 257 
test, CHI, 13 
value given probability, CHISQUARE, 209 
Cochran’s Q-test, QTEST, 101 
Cochrane-Orcutt autocorrelation adjustment, COCHE, 
Combinations, COMB, 70 
Comparisons between means, Scheffe’s test, SCHEFFE, 121 
Concordance, Kendall coefficient of, CONCORDANCE, 79 
Contingency tables 
chi-square coefficient, CHISQ, 15 
crosstabulation, CTAB, 17 
Fisher exact probability, CHISQ, 15 
Goodman-Kruskal gamma coefficient, GAMMA, 22 
Conversion to unlogged variables after regression, CONVERT, 171 
COPY, 9 
Correction of data, conversational, CORRECT, 40 
Correlation 
canonical correlation analysis, CANONICALCOR, 325 
coefficient of concordance, CONCORDANCE, 79 
partial, PARTCORR, 134 
point-biserial, PPBISERIAL, 99 
rank correlation, 
Kendall-tau, KENDALL, 84 
Spearman, RHO, 106 
simple 
CM, 130 
CORMAT, 132 
SCORR, 136 
Covariance, analysis of, ANOCOVA, 301 
Cox-Stuart test for trend, COXSTUART, 80 
Crosstabulation, CTAB, 17 
Curve fitting, see Regression, or Fit 


Deciles, NTILES, 67 

Descriptive statistics, DSTAT workspace, 23 

Discrete distribution, DISCRETE, 245 

Discriminant analysis, DISCRIM, 306 
stepwise option, STEPWISE, 314 

Duncan’s multiple range test, DUNCAN, 119 


Eigenvalues 
CANONICALCOR, 325 
DISCRIM, 306 
MARQUARDT, 155 
PRINCIPAL, 330 
STEPWISE, 314 
Entry of data, 39 
CORRECT, 40 
INPUT, 43 
Exponential distribution 
random number generation, NEXPONENTIAL, 247 
Extreme, Type I distribution, random number generation, 
NEXTREME, 248 


F distribution 
non-central, random number generation, PNON, 265 
probabilities, FPROB, 212 
random number generation, PF, 258 
Factorial analysis of variance 
ANOVA1, 273 
NWAYANOVAC, 287 
NWAYANOVAP, 292 
Fisher exact probability for 2x2 contingency tables, 
CHISQ, 15 

Fit (see regression) 
negative binomial fit, NEGBIN, 64 


normal fits 
ENORMFIT, 53 
FREQ, 58 
NORMFIT, 66 


Formatting tables 
conversational, TABLE, 48 
non-conversational, TAB, 46 
Frequency analysis, 51 
frequency table production 
conversational, formatted, FREQ, 58 
non-conversational, non-formatted, FR, 55 
frequencies to raw scores, DIST, 52 
negative binomial, NEGBIN, 64 
normal fit, NORMFIT, 66 
equal probability classes, ENORMFIT, 53 
n-tiles, calculation, NTILES, 67 
Friedman 2-way analysis of variance, FRIEDMAN, 82 


Gamma coefficient, GAMMA, 22 
Gamma distribution, random number generation, PGAMMA, 259 
Generalized least squares, GLS, 173 
Gompertz equation, calculation of parameters and statistics, GOMPERTZP, 152 
Geometric distribution 
hypergeometric probabilities, HYPERGEOPROB, 215 
hypergeometric, random number generation, PHYPER, 261 
random number generation, PGEOMETRIC, 260 
Goodman-Kruskal gamma coefficient, GAMMA, 22 
GRP, 10 


Hildreth-Lu autocorrelation adjustment, HILDRETHALU, 175 
Homogeneity of variance, Bartlett’s test 

conversational, BARTLETT, 118 

non-conversational, BART, 117 
Hypergeometric distribution 

probabilities, HYPERGEOPROB, 215 

random number generation, PHYPER, 261 


Input, conversational 
INPUT, 43 
CORRECT, 40 
Instrumental variables, INSTRUMENTAL, 178 


Kendall coefficient of concordance, CONCORDANCE, 79 
Kendall-tau rank correlation, 

KENDALL, 84 
Kurtosis, calculation of, KURTOSIS, 29 


Kruskal-Wallis 1-way analysis of variance, KRUSKAL, 87 
Laplace distribution, random number generation, NLAPLACE, 249 
Lagged variables, 142 

Almon lags (polynomial distributed), PDLAG, 143 

Shiller lags, SHILLER, 147 
Latin square, analysis of variance, LATINSQC, LATINSQP, 279, 281 
Least squares estimation 

linear, see Regression 

non-linear, see Non-linear regression 
Linear regression, see Regression 
LOAD, 9 
Logistic distribution, random number generation, NLOGISTIC, 250 
Lognormal distribution, random number generation, PLOGNORMAL, 262 


Mann-Whitney 
U-statistic, calculation, MANN, 89 
U-test, UTEST, 113 
MARQUARDT, 155 
MARQUARDTP, 157 
Matches, probabilities, MATCHPROB, 217 
McNemar test for changes, MCNEMAR, 91 
Mean 
08, 24 
DSTAT, 26 
MEAN, 30 
Mean deviation 
DSTAT, 26 
MEANDEV, 31 
Means comparisons, Scheffe’s test, SCHEFFE, 121 
Means tables and profiles, MEANSP, 283 
Median 
calculation of 
DSTAT, 26 
MEDIAN, 32 
NTILES, 67 
Median test 
2 groups, MEDIAN2TEST, 95 
several groups, MEDIANTEST, 93 
Mode, calculation of, MODE, 33 
Moses test of extreme reactions, MOSES, 96 
Moving average, SMOOTH, 73 
Multivariate Analysis, 304 
Canonical correlations, CANONICALCOR, 325 
Discriminant analysis, DISCRIM, 306 
stepwise option, STEPWISE, 314 
Principal components, PRINCIPAL, 330 


Negative binomial 
fitting to the negative binomial, NEGBIN, 64 
probabilities, NBIN, 218 
random number generation, PNEGBIN, 263 
Non-central distributions, random number generation 
chi-square, PNONCENTCHISQ, 266 
F, PNON, 265 
t, PNONCENTRALT, 267 
Non-linear regression, 151 
GOMPERTZP, 152 
MARQUARDT, 155 
MARQUARDTP, 157 
Nonparametric statistics, 75 
analysis of variance on categorical data, CATANOVA, 78 
binomial test, BINOMTEST, 77 
Cochran's Q-test, QTEST, 101 
correlation 
Kendall-tau, KENDALL, 84 
point-biserial, PPBISERIAL, 99 
Spearman, RHO, 106 
Cox-Stuart test for trend, COXSTUART, 80 
Friedman 2-way analysis of variance, FRIEDMAN, 82 
Kendall coefficient of concordance, CONCORDANCE, 79 
Kendall-tau rank order correlation coefficient, KENDALL, 84 
Kruskal-Wallis 1-way analysis of variance, KRUSKAL, 87 
Mann-Whitney U-statistic 
computation, MANN, 89 
test, UTEST, 113 
McNemar test for changes, MCNEMAR, 91 
median test 
2 groups, MEDIAN2TEST, 95 
several groups, MEDIANTEST, 93 
Moses test of extreme reactions, MOSES, 96 
normal, percentile and ordinate, NDTRI, 97 
point-biserial correlation, PTBISERIAL, 99 
Q-test, Cochran's, QTEST, 101 
runs 
number of runs, RUNS, 108 
test for randomness, RUNSTEST, 109 
test for runs above & below median, RANDOMTEST, 103 
Wald-Wolfowitz, WALD, 114 
rank, computation, RANK, 105 
sign test, SIGNTEST, 111 
Spearman rank correlation, RHO, 106 
Wald-Wolfowitz runs test, WALD, 114 
Normal distribution 
ordinate+cumulative probability, GAUSS, 214 
ordinate of normal, NORMORD, 224 
percentile and ordinate of normal, NDTRI, 97 
probabilities, NORMALPROB, 222 
probabilities by classes, NORMAL, 220 
random number generation, NNORMAL, 251 
N-tiles, NTILES, 67 


Partial correlation, PARTCORR, 134 
PCOPY, 9 

Permutations, PERMUTE, 71 
Poisson 
classes, POISSON, 229 
cumulative, POIPROBC, 227 
cumulative probability»ordinate, POIVAL, 235 
individual, POIPROBI, 228 
individual and cumulative probabilities, POIPROB, 225 
labelled table, POITABLE, 233 
probabilities for a vector and tails, 
POISSONPROB, 231 
random number generation, PPOISSON, 268 
Polynomial distributed lag, PDLAG, 143 
Pooled variance, after Bartlett’s test 
conversational, BARTLETT, 118 
non-conversational, BART, 117 
Prediction of values after regression, PREDICT, 181 
Principal components analysis, PRINCIPAL, 330 
Probability distributions, 204 
binomial, BINOMIALPROB, 207 
cumulative, BINOM, 206 
chi-square 
value»probability, CHISQUAREPROB, 210 
probability»value, CHISQUARE, 209 
F, FPROB, 212 
hypergeometric, HYPERGEOPROB, 215 
matches, MATCHPROB, 217 
negative binomial, NBIN, 218 
normal 
calculates probabilities, NORMALPROB, 222 
ordinate of the normal, NORMORD, 224 
probabilities for classes, NORMAL, 220 
Z for given probability, GAUSS, 214 
runs, RUNSPROB, 236 
signed-rank, SIGNEDRANKPROB, 238 
t-distribution 
STUDENT, 240 
TPROB, 241 
Poisson distribution 
cumulative probabilities, POIPROBC, 227
cumulative probabilities→ordinate, POIVAL, 235
individual and cumulative probabilities, POIPROB, 225 
individual probabilities, POIPROBI, 228
probabilities for classes, POISSON, 229 
probabilities for vector and tails, POISSONPROB, 231
random number generation, PPOISSON, 268 
table output of probabilities, POITABLE, 233
Probit analysis, PROBIT, 163 


Quartiles, NTILES, 67 
Q-test, Cochran's, QTEST, 101 


Random link definition, RANDOMIZE, 270 
Random number generation, 243 
binomial distribution, PBINOMIAL, 256 
beta distribution, PBETA, 254
Cauchy distribution, NCAUCHY, 246 
chi-square distribution, PCHISQUARE, 257 
discrete distribution, DISCRETE, 245 
exponential distribution, NEXPONENTIAL, 247
extreme distribution, NEXTREME, 248 
F distribution, PF, 258
gamma distribution, PGAMMA, 259
geometric distribution, PGEOMETRIC, 260 
hypergeometric distribution, PHYPER, 261 
Laplace distribution, NLAPLACE, 249
logistic distribution, NLOGISTIC, 250 
lognormal distribution, PLOGNORMAL, 262 
negative binomial distribution, PNEGBIN, 263
noncentral chi-square distribution, PNONCENTCHISQ, 266
noncentral F distribution, РОЛ, 265
noncentral t distribution, PNONCENTRALT, 267
normal distribution, NNORMAL, 251
Poisson, PPOISSON, 268 
t distribution, PT, 269
uniform distribution, NUNIFORM, 253 
RANGE, 34 
Regression, 167 
Cochrane-Orcutt, COCHRANEΔORCUTT, 168
conversion to unlogged variables, CONVERT, 171 
generalized least squares, GLS, 173 
Hildreth-Lu, HILDRETHΔLU, 175
instrumental variables, INSTRUMENTAL, 178 
large regression problems, REGRESS, 138 
multiple regression, REGR, 183 
non-linear, see non-linear regression 
ordinary least squares, REGR, 183 
polynomial distributed lag, PDLAG, 143 
prediction of expected values, PREDICT, 181 
Shiller lag analysis, SHILLER, 147 
stepwise, see stepwise regression 
three-stage least squares, STAGE3, 188
two-stage least squares, STAGE3, 188 
Rounding, ROUND, 72 
Runs 
number of runs, RUNS, 108
probability, RUNSPROB, 236 
test for randomness 
RUNSTEST, 109 
RANDOMTEST, 103 
Wald-Wolfowitz, WALD, 114 


Scheffe’s test of means comparisons, SCHEFFE, 121 
Shiller lags, SHILLER, 147 
Sign test, SIGNTEST, 111 
Signed rank test probabilities, SIGNEDRANKPROB, 238 
Simple correlation 

CM, 130
CORMAT, 132
SCORR, 136 
Simultaneous linear equations, three-stage least squares, STAGE3, 188
SKEWNESS, 35 
Smoothing, SMOOTH, 73 
Spearman rank correlation test, RHO, 106 
STAGE3, 188
Standard deviation 
DS, 24 
DSTAT, 26 
STDDEV, 36 
Standard error 
DSTAT, 26 
STDERR, 37 
Stepwise regression, 193 
STEPWISEC, 194 
STEPWISEFC, 199 
Student’s t distribution, see T distribution 
Surface fitting 
linear, see Regression 
non-linear, see Non-linear Regression 


Table preparation 
conversational, TABLE, 48 
non-conversational, TAB, 46
T distribution
non-central, random number generation, PNONCENTRALT, 267
random number generation, PT, 269
probabilities 
probability→value, STUDENT, 240
value→probability, TPROB, 241
Three-stage least squares, STAGE3, 188 
T-test, 123 
calculation of t statistics and test, TTEST, 126 
test, TPOT, 124 
Two-stage least squares, STAGE3, 188 
Type I extreme value distribution 
random number generation, NEXTREME, 248


Uniform distribution, random number generation, NUNIFORM, 253 
U-statistic, Mann-Whitney, calculation, MANN, 89 
U-test, Mann-Whitney, UTEST, 113 


Variance 
DS, 24 
DSTAT, 26 
VAR, 38 


Wald-Wolfowitz runs test, WALD, 114 
Weighted moving average, SMOOTH, 73 